{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction to Quantitative Finance\n",
    "\n",
    "Copyright (c) 2019 Python Charmers Pty Ltd, Australia, <https://pythoncharmers.com>. All rights reserved.\n",
    "\n",
    "<img src=\"img/python_charmers_logo.png\" width=\"300\" alt=\"Python Charmers Logo\">\n",
    "\n",
    "Published under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license. See `LICENSE.md` for details.\n",
    "\n",
    "Sponsored by Tibra Global Services, <https://tibra.com>\n",
    "\n",
    "<img src=\"img/tibra_logo.png\" width=\"300\" alt=\"Tibra Logo\">\n",
    "\n",
    "\n",
    "## Module 2.1: Hypothesis Testing\n",
    "\n",
    "### 2.1.1 Hypothesis Testing\n",
    "\n",
    "Hypothesis testing is a formal method of testing your assumptions with data and statistics. With hypothesis testing, we create two (or more) competing hypothesis and decide which is more likely. In a variety of circumstances, we consider something \"unlikely\" if it has less than 5% chance of happening, but note this would mean that this happens in 1 in 20 experiments *anyway, just due to chance*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://imgs.xkcd.com/comics/significant.png\" title=\"'So, uh, we did the green study again and got no link. It was probably a--' 'RESEARCH CONFLICTED ON GREEN JELLY BEAN/ACNE LINK; MORE STUDY RECOMMENDED!'\" alt=\"Significant\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(Above comic from https://xkcd.com/882/ Hint: count how many \"per-colour\" experiments were performed.)\n",
    "\n",
    "A common hypothesis pair is the following:\n",
    "\n",
    "* $H_0$, the Null hypothesis, is that any difference between the data in our sample, and the general population, is caused strictly by chance.\n",
    "* $H_A$, the Alternative hypothesis, is that there is a significant difference.\n",
    "\n",
    "The hypotheses must be mutually exclusive, as in it cannot be that both are true. They do not have to be exhaustive, i.e. account for all possible scenarios, but it is often the case. It is usually easier to compute the probability for $H_0$, i.e. $P(H_0)$ and then just compute $P(H_A) = 1 - P(H_0)$. This only applies if the pair is exhaustive.\n",
    "\n",
    "For example, we might have a sample, say trading firms using statistical analysis for decision making. Our Null hypothesis is that these firms are no different to the *population*, which may be either \"all trading firms\" or \"random people picking stocks\". We set up an experiment (more below) and find that we have a 4% chance that our Null hypothesis would generate results this extreme.\n",
    "\n",
    "We might then say that, because the chance is so low, we reject the Null hypothesis that any different is strictly by chance. \n",
    "\n",
    "Note that in this example, there is still a one in 25 chance of obtaining the result purely through \"luck\"."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Importantly, this does *not* mean that we accept that using statistical analysis causing trading firms to profit more than normal. There may be some other factors involved that are causing the difference. While this might sound like [weasel words](https://en.wikipedia.org/wiki/Weasel_word), this type of inference happens all the time. For instance, firms using statistical analysis might get their advantage from simply a more careful and thorough analysis, rather than the specific statistical analysis they apply. They might be run by larger firms (who can afford the extra staff to do the statistical analysis and have access to more data), or many other factors.\n",
    "\n",
    "It could also be simply by chance, as the above comic demonstrates.\n",
    "\n",
    "This is a common logical fallacy called *Denying the antecedent*:\n",
    "    \n",
    "    If P, then Q.\n",
    "    Therefore, if not P, then not Q.\n",
    "\n",
    "Note that the conclusion above is **invalid** based solely on the condition. Our statistical tests generally tell us \"not P\", and we are left with a bit more evidence of \"not Q\", but never proof.\n",
    "\n",
    "Be wary of this fallacy. Statistics is often pessimistic in this regard - it rarely tells you that you are correct, it generally just tells you if there is a strong chance you are wrong.\n",
    "\n",
    "\n",
    "Another common pitfall, demonstrated by the above comic, is that when we run multiple tests at the same time, our notion of a probability threshold must change, otherwise it becomes **likely** that we observe a purely-by-chance outcome."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%run setup.ipy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Exercise\n",
    "\n",
    "Suppose we had $n$ independent tests. In each test, there is a 5% likelihood that the effect we are measuring is \"high\", for however that is defined. If we run all $n$ tests, what is the likelihood that *at least one test* measures a positive, high, outcome, for given $n$ values:\n",
    "\n",
    "- Two experiments\n",
    "- Ten experiments\n",
    "- Twenty experiments\n",
    "\n",
    "The above results can be computed with a calculator (or by hand if you are fine with leaving a fraction).\n",
    "\n",
    "#### Extended exercise\n",
    "\n",
    "Write a program with a function that does the following:\n",
    "\n",
    "- Compute 1000 random numbers from a normal distribution $N(0, 1)$\n",
    "- Compute the mean of those values\n",
    "\n",
    "Note the expected mean is 0.\n",
    "\n",
    "Plot a histogram of the means from running this function 10,000 times. How many of the results have a mean of more than 0.166?\n",
    "\n",
    "From your results here, note that even with *purely random data*, we can get very high differences between a same (running our function once) and the general population ($N(0, 1)$), just by chance in our we got our sample."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.09750000000000003\n",
      "0.4012630607616213\n",
      "0.6415140775914581\n"
     ]
    }
   ],
   "source": [
    "#1 - at least 1 = 1 - none\n",
    "def prob(p, n):\n",
    "    return 1-(1-p)**n\n",
    "\n",
    "print(prob(0.05,2))\n",
    "print(prob(0.05,10))\n",
    "print(prob(0.05,20))"
   ]
  },
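  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The numbers printed above follow from the complement rule, and the same algebra suggests how a threshold could be adjusted when several tests are run together. This is a sketch that assumes the $n$ tests are independent and share a per-test probability $p$; the symbol $\\alpha$ for the desired overall rate and the adjusted per-test threshold $p'$ (commonly called the Šidák and Bonferroni corrections) are additions here, not part of the exercise.\n",
    "\n",
    "$$P(\\text{at least one}) = 1 - P(\\text{none}) = 1 - (1 - p)^n$$\n",
    "\n",
    "$$p = 0.05,\\ n = 20: \\quad 1 - 0.95^{20} \\approx 0.64$$\n",
    "\n",
    "$$1 - (1 - p')^n = \\alpha \\;\\Rightarrow\\; p' = 1 - (1 - \\alpha)^{1/n} \\approx \\frac{\\alpha}{n}$$\n",
    "\n",
    "With $\\alpha = 0.05$ and $n = 20$, this gives a per-test threshold of roughly $0.0025$ rather than $0.05$."
   ]
  },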
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjAAAAGdCAYAAAAMm0nCAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy88F64QAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAjhElEQVR4nO3de3BU5cHH8V9I2IUIuxEh2WQMEG9cFAFRwlbFC5kkGFErnRZFQYtQMNjBgAJ9Ndw6jYKKl6qMWowdb+iMV1KREAQUF9CUCATIIIUJFDYomCxQCJc87x+dnLoSkMSEzRO+n5kzZc959uxzznGa72zObqKMMUYAAAAWaRXpCQAAANQXAQMAAKxDwAAAAOsQMAAAwDoEDAAAsA4BAwAArEPAAAAA6xAwAADAOjGRnkBTqamp0a5du9S+fXtFRUVFejoAAOA0GGO0f/9+JSUlqVWrk7/P0mIDZteuXUpOTo70NAAAQAPs2LFD559//km3t9iAad++vaT/ngCPxxPh2QAAgNMRCoWUnJzs/Bw/mRYbMLW/NvJ4PAQMAACW+bnbP7iJFwAAWIeAAQAA1iFgAACAdQgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1omJ9AQAoCG6TimI9BTqbftjWZGeAtBi8A4MAACwDgEDAACsQ8AAAADrEDAAAMA6BAwAALAOAQMAAKxDwAAAAOsQMAAAwDoEDAAAsA4BAwAArEPAAAAA6xAwAADAOgQMAACwDgEDAACsQ8AAAADrEDAAAMA6BAwAALAOAQMAAKxDwAAAAOsQMAAAwDoEDAAAsA4BAwAArEPAAAAA6xAwAADAOgQMAACwDgEDAACsQ8AAAADrEDAAAMA6BAwAALAOAQMAAKxDwAAAAOsQMAAAwDoEDAAAsA4BAwAArEPAAAAA6xAwAADAOgQMAACwDgEDAACsQ8AAAADr1Ctg8vLydNVVV6l9+/aKj4/XbbfdprKysrAx119/vaKiosKWsWPHho0pLy9XVlaWYmNjFR8fr4ceekjHjh0LG7Ns2TJdccUVcrvduuiii5Sfn9+wIwQAAC1OvQJm+fLlys7O1qpVq1RYWKijR48qPT1dBw8eDBs3evRo7d6921lmz57tbDt+/LiysrJ05MgRffnll3rttdeUn5+v3NxcZ8y2bduUlZWlG264QSUlJZowYYLuu+8+ffrpp7/wcAEAQEsQU5/BixYtCnucn5+v+Ph4FRcXa+DAgc762NhY+Xy+OvexePFibdy4UUuWLFFCQoL69OmjWbNmafLkyZo+fbpcLpfmzZunlJQUPfnkk5KkHj166IsvvtDcuXOVkZFR32MEAAAtTL0C5qeqqqokSR06dAhb/8Ybb+j111+Xz+fTkCFD9Oijjyo2NlaSFAgE1KtXLyUkJDjjMzIyNG7cOJWWlqpv374KBAJKS0sL22dGRoYmTJhw0rlUV1erurraeRwKhX7JoQFnla5TCiI9BQColwYHTE1NjSZMmKCrr75al112mbP+zjvvVJcuXZSUlKR169Zp8uTJKisr03vvvSdJCgaDYfEiyXkcDAZPOSYUCunQoUNq27btCfPJy8vTjBkzGno4AADAIg0OmOzsbG3YsEFffPFF2PoxY8Y4/+7Vq5cSExM1aNAgbd26VRdeeGHDZ/ozpk6dqpycHOdxKBRScnJyk70eAACInAZ9jHr8+PFauHChPvvsM51//vmnHJuamipJ+vbbbyVJPp9PFRUVYWNqH9feN3OyMR6Pp853XyTJ7XbL4/GELQAAoGWqV8AYYzR+/Hi9//77Wrp0qVJSUn72OSUlJZKkxMRESZLf79f69eu1Z88eZ0xhYaE8Ho969uzpjCkqKgrbT2Fhofx+f32mCwAAWqh6BUx2drZef/11vfnmm2rfvr2CwaCCwaAOHTokSdq6datmzZql
4uJibd++XR999JFGjBihgQMH6vLLL5ckpaenq2fPnrr77rv1zTff6NNPP9Ujjzyi7Oxsud1uSdLYsWP1r3/9Sw8//LA2b96sF154Qe+8844efPDBRj58AABgo3oFzIsvvqiqqipdf/31SkxMdJYFCxZIklwul5YsWaL09HR1795dEydO1NChQ/Xxxx87+4iOjtbChQsVHR0tv9+vu+66SyNGjNDMmTOdMSkpKSooKFBhYaF69+6tJ598Uq+88gofoQYAAJKkKGOMifQkmkIoFJLX61VVVRX3wwA/g49RnxnbH8uK9BSAZu90f37zt5AAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1iFgAACAdQgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1iFgAACAdQgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1iFgAACAdQgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1iFgAACAdQgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFinXgGTl5enq666Su3bt1d8fLxuu+02lZWVhY05fPiwsrOzdd5556ldu3YaOnSoKioqwsaUl5crKytLsbGxio+P10MPPaRjx46FjVm2bJmuuOIKud1uXXTRRcrPz2/YEQIAgBanXgGzfPlyZWdna9WqVSosLNTRo0eVnp6ugwcPOmMefPBBffzxx3r33Xe1fPly7dq1S7fffruz/fjx48rKytKRI0f05Zdf6rXXXlN+fr5yc3OdMdu2bVNWVpZuuOEGlZSUaMKECbrvvvv06aefNsIhAwAA20UZY0xDn/zdd98pPj5ey5cv18CBA1VVVaVOnTrpzTff1G9+8xtJ0ubNm9WjRw8FAgENGDBAn3zyiW6++Wbt2rVLCQkJkqR58+Zp8uTJ+u677+RyuTR58mQVFBRow4YNzmsNGzZMlZWVWrRo0WnNLRQKyev1qqqqSh6Pp6GHCJwVuk4piPQUzgrbH8uK9BSAZu90f37/ontgqqqqJEkdOnSQJBUXF+vo0aNKS0tzxnTv3l2dO3dWIBCQJAUCAfXq1cuJF0nKyMhQKBRSaWmpM+bH+6gdU7uPulRXVysUCoUtAACgZWpwwNTU1GjChAm6+uqrddlll0mSgsGgXC6X4uLiwsYmJCQoGAw6Y34cL7Xba7edakwoFNKhQ4fqnE9eXp68Xq+zJCcnN/TQAABAM9fggMnOztaGDRv09ttvN+Z8Gmzq1Kmqqqpylh07dkR6SgAAoInENORJ48eP18KFC7VixQqdf/75znqfz6cjR46osrIy7F2YiooK+Xw+Z8yaNWvC9lf7KaUfj/npJ5cqKirk8XjUtm3bOufkdrvldrsbcjgAAMAy9XoHxhij8ePH6/3339fSpUuVkpIStr1fv35q3bq1ioqKnHVlZWUqLy+X3++XJPn9fq1fv1579uxxxhQWFsrj8ahnz57OmB/vo3ZM7T4AAMDZrV7vwGRnZ+vNN9/Uhx9+qPbt2zv3rHi9XrVt21Zer1ejRo1STk6OOnToII/HowceeEB+v18DBgyQJKWnp6tnz566++67NXv2bAWDQT3yyCPKzs523kEZO3as/vrXv+rhhx/W73//ey1dulTvvPOOCgr4pAQAAKjnOzAvvviiqqqqdP311ysxMdFZFixY4IyZO3eubr75Zg0dOlQDBw6Uz+fTe++952yPjo7WwoULFR0dLb/fr7vuuksjRozQzJkznTEpKSkqKChQYWGhevfurSeffFKvvPKKMjIyGuGQAQCA7X7R98A0Z3wPDHD6+B6YM4PvgQF+3hn5HhgA
AIBIIGAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYJ0G/S0kAED92fh9O3x3DZor3oEBAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1iFgAACAdQgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1iFgAACAdQgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1iFgAACAdQgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHXqHTArVqzQkCFDlJSUpKioKH3wwQdh2++55x5FRUWFLZmZmWFj9u3bp+HDh8vj8SguLk6jRo3SgQMHwsasW7dO1157rdq0aaPk5GTNnj27/kcHAABapHoHzMGDB9W7d289//zzJx2TmZmp3bt3O8tbb70Vtn348OEqLS1VYWGhFi5cqBUrVmjMmDHO9lAopPT0dHXp0kXFxcWaM2eOpk+frpdeeqm+0wUAAC1QTH2fMHjwYA0ePPiUY9xut3w+X53bNm3apEWLFumrr77SlVdeKUl67rnndNNNN+mJJ55QUlKS3njjDR05ckTz58+Xy+XSpZdeqpKSEj311FNhoQMAAM5OTXIPzLJlyxQfH69u3bpp3Lhx2rt3r7MtEAgoLi7OiRdJSktLU6tWrbR69WpnzMCBA+VyuZwxGRkZKisr0w8//FDna1ZXVysUCoUtAACgZWr0gMnMzNTf//53FRUV6fHHH9fy5cs1ePBgHT9+XJIUDAYVHx8f9pyYmBh16NBBwWDQGZOQkBA2pvZx7ZifysvLk9frdZbk5OTGPjQAANBM1PtXSD9n2LBhzr979eqlyy+/XBdeeKGWLVumQYMGNfbLOaZOnaqcnBzncSgUImIAAGihmvxj1BdccIE6duyob7/9VpLk8/m0Z8+esDHHjh3Tvn37nPtmfD6fKioqwsbUPj7ZvTVut1sejydsAQAALVOTB8zOnTu1d+9eJSYmSpL8fr8qKytVXFzsjFm6dKlqamqUmprqjFmxYoWOHj3qjCksLFS3bt107rnnNvWUAQBAM1fvgDlw4IBKSkpUUlIiSdq2bZtKSkpUXl6uAwcO6KGHHtKqVau0fft2FRUV6dZbb9VFF12kjIwMSVKPHj2UmZmp0aNHa82aNVq5cqXGjx+vYcOGKSkpSZJ05513yuVyadSoUSotLdWCBQv0zDPPhP2KCAAAnL3qHTBff/21+vbtq759+0qScnJy1LdvX+Xm5io6Olrr1q3TLbfcoksuuUSjRo1Sv3799Pnnn8vtdjv7eOONN9S9e3cNGjRIN910k6655pqw73jxer1avHixtm3bpn79+mnixInKzc3lI9QAAECSFGWMMZGeRFMIhULyer2qqqrifhjgZ3SdUhDpKaCZ2v5YVqSngLPM6f785m8hAQAA6xAwAADAOgQMAACwDgEDAACsQ8AAAADrEDAAAMA6BAwAALAOAQMAAKxDwAAAAOsQMAAAwDoEDAAAsE5MpCcAtDT8XSEAaHq8AwMAAKxDwAAAAOsQMAAAwDoEDAAAsA4BAwAArEPAAAAA6xAwAADAOgQMAACwDgEDAACsQ8AAAADrEDAAAMA6BAwAALAOAQMAAKxDwAAAAOsQMAAAwDoEDAAAsA4BAwAArEPAAAAA6xAwAADAOgQMAACwDgEDAACsQ8AAAADrEDAAAMA6BAwAALAOAQMAAKxDwAAAAOsQMAAAwDoEDAAAsA4BAwAArEPAAAAA6xAwAADAOgQMAACwDgEDAACsQ8AAAADrEDAAAMA6BAwAALAOAQMAAKxDwAAAAOsQMAAAwDr1DpgVK1ZoyJAhSkpKUlRUlD744IOw7cYY5ebmKjExUW3btlVa
Wpq2bNkSNmbfvn0aPny4PB6P4uLiNGrUKB04cCBszLp163TttdeqTZs2Sk5O1uzZs+t/dAAAoEWqd8AcPHhQvXv31vPPP1/n9tmzZ+vZZ5/VvHnztHr1ap1zzjnKyMjQ4cOHnTHDhw9XaWmpCgsLtXDhQq1YsUJjxoxxtodCIaWnp6tLly4qLi7WnDlzNH36dL300ksNOEQAANDSRBljTIOfHBWl999/X7fddpuk/777kpSUpIkTJ2rSpEmSpKqqKiUkJCg/P1/Dhg3Tpk2b1LNnT3311Ve68sorJUmLFi3STTfdpJ07dyopKUkvvvii/u///k/BYFAul0uSNGXKFH3wwQfavHnzac0tFArJ6/WqqqpKHo+noYcI1FvXKQWRngLQaLY/lhXpKeAsc7o/vxv1Hpht27YpGAwqLS3NWef1epWamqpAICBJCgQCiouLc+JFktLS0tSqVSutXr3aGTNw4EAnXiQpIyNDZWVl+uGHH+p87erqaoVCobAFAAC0TI0aMMFgUJKUkJAQtj4hIcHZFgwGFR8fH7Y9JiZGHTp0CBtT1z5+/Bo/lZeXJ6/X6yzJycm//IAAAECz1GI+hTR16lRVVVU5y44dOyI9JQAA0EQaNWB8Pp8kqaKiImx9RUWFs83n82nPnj1h248dO6Z9+/aFjalrHz9+jZ9yu93yeDxhCwAAaJkaNWBSUlLk8/lUVFTkrAuFQlq9erX8fr8kye/3q7KyUsXFxc6YpUuXqqamRqmpqc6YFStW6OjRo86YwsJCdevWTeeee25jThkAAFio3gFz4MABlZSUqKSkRNJ/b9wtKSlReXm5oqKiNGHCBP35z3/WRx99pPXr12vEiBFKSkpyPqnUo0cPZWZmavTo0VqzZo1Wrlyp8ePHa9iwYUpKSpIk3XnnnXK5XBo1apRKS0u1YMECPfPMM8rJyWm0AwcAAPaKqe8Tvv76a91www3O49qoGDlypPLz8/Xwww/r4MGDGjNmjCorK3XNNddo0aJFatOmjfOcN954Q+PHj9egQYPUqlUrDR06VM8++6yz3ev1avHixcrOzla/fv3UsWNH5ebmhn1XDAAAOHv9ou+Bac74HhhECt8Dg5aE74HBmRaR74EBAAA4EwgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1iFgAACAdQgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1iFgAACAdQgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHViIj0BAEDz1XVKQaSnUG/bH8uK9BRwBvAODAAAsA4BAwAArEPAAAAA6xAwAADAOgQMAACwDgEDAACsQ8AAAADrEDAAAMA6BAwAALAOAQMAAKxDwAAAAOsQMAAAwDoEDAAAsA4BAwAArEPAAAAA6xAwAADAOo0eMNOnT1dUVFTY0r17d2f74cOHlZ2drfPOO0/t2rXT0KFDVVFREbaP8vJyZWVlKTY2VvHx8XrooYd07Nixxp4qAACwVExT7PTSSy/VkiVL/vciMf97mQcffFAFBQV699135fV6NX78eN1+++1auXKlJOn48ePKysqSz+fTl19+qd27d2vEiBFq3bq1/vKXvzTFdAEAgGWaJGBiYmLk8/lOWF9VVaW//e1vevPNN3XjjTdKkl599VX16NFDq1at0oABA7R48WJt3LhRS5YsUUJCgvr06aNZs2Zp8uTJmj59ulwuV1NMGQAAWKRJ7oHZsmWLkpKSdMEFF2j48OEqLy+XJBUXF+vo0aNKS0tzxnbv3l2dO3dWIBCQJAUCAfXq1UsJCQnOmIyMDIVCIZWWljbFdAEAgGUa/R2Y1NRU5efnq1u3btq9e7dmzJiha6+9Vhs2bFAwGJTL5VJcXFzYcxIS
EhQMBiVJwWAwLF5qt9duO5nq6mpVV1c7j0OhUCMdEQAAaG4aPWAGDx7s/Pvyyy9XamqqunTponfeeUdt27Zt7Jdz5OXlacaMGU22fwAA0Hw0+ceo4+LidMkll+jbb7+Vz+fTkSNHVFlZGTamoqLCuWfG5/Od8Kmk2sd13VdTa+rUqaqqqnKWHTt2NO6BAACAZqPJA+bAgQPaunWrEhMT1a9fP7Vu3VpFRUXO9rKyMpWXl8vv90uS/H6/1q9frz179jhjCgsL5fF41LNnz5O+jtvtlsfjCVsAAEDL1Oi/Qpo0aZKGDBmiLl26aNeuXZo2bZqio6N1xx13yOv1atSoUcrJyVGHDh3k8Xj0wAMPyO/3a8CAAZKk9PR09ezZU3fffbdmz56tYDCoRx55RNnZ2XK73Y09XQAAYKFGD5idO3fqjjvu0N69e9WpUyddc801WrVqlTp16iRJmjt3rlq1aqWhQ4equrpaGRkZeuGFF5znR0dHa+HChRo3bpz8fr/OOeccjRw5UjNnzmzsqQIAAEtFGWNMpCfRFEKhkLxer6qqqvh1Es6orlMKIj0F4Ky2/bGsSE8Bv8Dp/vzmbyEBAADrEDAAAMA6BAwAALAOAQMAAKxDwAAAAOs0yV+jBhoLn+gBANSFd2AAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1iFgAACAdQgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1iFgAACAdQgYAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHViIj0BAAAaU9cpBZGeQr1tfywr0lOwDu/AAAAA6xAwAADAOgQMAACwDgEDAACsQ8AAAADrEDAAAMA6BAwAALAOAQMAAKzDF9mdJWz8YicAAE6Gd2AAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYJ1m/THq559/XnPmzFEwGFTv3r313HPPqX///pGeFgAAjcrGr7rY/lhWRF+/2b4Ds2DBAuXk5GjatGn65z//qd69eysjI0N79uyJ9NQAAECENduAeeqppzR69Gjde++96tmzp+bNm6fY2FjNnz8/0lMDAAAR1ix/hXTkyBEVFxdr6tSpzrpWrVopLS1NgUCgzudUV1erurraeVxVVSVJCoVCjT6/y6Z92uj7BADAJk3x8/XH+zXGnHJcswyY77//XsePH1dCQkLY+oSEBG3evLnO5+Tl5WnGjBknrE9OTm6SOQIAcDbzPt20+9+/f7+8Xu9JtzfLgGmIqVOnKicnx3lcU1Ojffv26bzzzlNUVFQEZxYZoVBIycnJ2rFjhzweT6Snc9bh/Ece1yCyOP+RZ+s1MMZo//79SkpKOuW4ZhkwHTt2VHR0tCoqKsLWV1RUyOfz1fkct9stt9sdti4uLq6ppmgNj8dj1X+4LQ3nP/K4BpHF+Y88G6/Bqd55qdUsb+J1uVzq16+fioqKnHU1NTUqKiqS3++P4MwAAEBz0CzfgZGknJwcjRw5UldeeaX69++vp59+WgcPHtS9994b6akBAIAIa7YB87vf/U7fffedcnNzFQwG1adPHy1atOiEG3tRN7fbrWnTpp3wazWcGZz/yOMaRBbnP/Ja+jWIMj/3OSUAAIBmplneAwMAAHAqBAwAALAOAQMAAKxDwAAAAOsQMJbat2+fhg8fLo/Ho7i4OI0aNUoHDhw45XNeeuklXX/99fJ4PIqKilJlZWWj7Pds1ZBzdfjwYWVnZ+u8885Tu3btNHTo0BO+sDEqKuqE5e23327KQ7HC888/r65du6pNmzZKTU3VmjVrTjn+3XffVffu3dWmTRv16tVL//jHP8K2G2OUm5urxMREtW3bVmlpadqyZUtTHoL1Gvsa3HPPPSf8t56ZmdmUh2C1+pz/0tJSDR06VF27dlVU
VJSefvrpX7zPZsfASpmZmaZ3795m1apV5vPPPzcXXXSRueOOO075nLlz55q8vDyTl5dnJJkffvihUfZ7tmrIuRo7dqxJTk42RUVF5uuvvzYDBgwwv/rVr8LGSDKvvvqq2b17t7McOnSoKQ+l2Xv77beNy+Uy8+fPN6WlpWb06NEmLi7OVFRU1Dl+5cqVJjo62syePdts3LjRPPLII6Z169Zm/fr1zpjHHnvMeL1e88EHH5hvvvnG3HLLLSYlJeWsP9cn0xTXYOTIkSYzMzPsv/V9+/adqUOySn3P/5o1a8ykSZPMW2+9ZXw+n5k7d+4v3mdzQ8BYaOPGjUaS+eqrr5x1n3zyiYmKijL//ve/f/b5n332WZ0B80v3ezZpyLmqrKw0rVu3Nu+++66zbtOmTUaSCQQCzjpJ5v3332+yuduof//+Jjs723l8/Phxk5SUZPLy8uoc/9vf/tZkZWWFrUtNTTV/+MMfjDHG1NTUGJ/PZ+bMmeNsr6ysNG6327z11ltNcAT2a+xrYMx/A+bWW29tkvm2NPU9/z/WpUuXOgPml+yzOeBXSBYKBAKKi4vTlVde6axLS0tTq1attHr16ma335aoIeequLhYR48eVVpamrOue/fu6ty5swKBQNjY7OxsdezYUf3799f8+fN/9s/Kt2RHjhxRcXFx2Hlr1aqV0tLSTjhvtQKBQNh4ScrIyHDGb9u2TcFgMGyM1+tVamrqSfd5NmuKa1Br2bJlio+PV7du3TRu3Djt3bu38Q/Acg05/5HY55nWbL+JFycXDAYVHx8fti4mJkYdOnRQMBhsdvttiRpyroLBoFwu1wl/ZDQhISHsOTNnztSNN96o2NhYLV68WPfff78OHDigP/7xj41+HDb4/vvvdfz48RO+hTshIUGbN2+u8znBYLDO8bXnufZ/TzUG/9MU10CSMjMzdfvttyslJUVbt27Vn/70Jw0ePFiBQEDR0dGNfyCWasj5j8Q+zzQCphmZMmWKHn/88VOO2bRp0xmazdmpOVyDRx991Pl33759dfDgQc2ZM+esDRi0XMOGDXP+3atXL11++eW68MILtWzZMg0aNCiCM4MNCJhmZOLEibrnnntOOeaCCy6Qz+fTnj17wtYfO3ZM+/btk8/na/DrN9V+bdKU18Dn8+nIkSOqrKwMexemoqLilOc3NTVVs2bNUnV1dYv9myan0rFjR0VHR5/waa1TnTefz3fK8bX/W1FRocTExLAxffr0acTZtwxNcQ3qcsEFF6hjx4769ttvCZgfacj5j8Q+zzTugWlGOnXqpO7du59ycblc8vv9qqysVHFxsfPcpUuXqqamRqmpqQ1+/abar02a8hr069dPrVu3VlFRkbOurKxM5eXl8vv9J51TSUmJzj333LMyXiTJ5XKpX79+YeetpqZGRUVFJz1vfr8/bLwkFRYWOuNTUlLk8/nCxoRCIa1evfqU1+Js1RTXoC47d+7U3r17w6ISDTv/kdjnGRfpu4jRMJmZmaZv375m9erV5osvvjAXX3xx2Ed4d+7cabp162ZWr17trNu9e7dZu3atefnll40ks2LFCrN27Vqzd+/e094v/qch12Ds2LGmc+fOZunSpebrr782fr/f+P1+Z/tHH31kXn75ZbN+/XqzZcsW88ILL5jY2FiTm5t7Ro+tuXn77beN2+02+fn5ZuPGjWbMmDEmLi7OBINBY4wxd999t5kyZYozfuXKlSYmJsY88cQTZtOmTWbatGl1fow6Li7OfPjhh2bdunXm1ltv5WPUp9DY12D//v1m0qRJJhAImG3btpklS5aYK664wlx88cXm8OHDETnG5qy+57+6utqsXbvWrF271iQmJppJkyaZtWvXmi1btpz2Pps7AsZSe/fuNXfccYdp166d8Xg85t577zX79+93tm/bts1IMp999pmzbtq0aUbSCcurr7562vvF/zTkGhw6dMjcf//95txzzzWxsbHm17/+tdm9e7ez/ZNPPjF9+vQx
7dq1M+ecc47p3bu3mTdvnjl+/PiZPLRm6bnnnjOdO3c2LpfL9O/f36xatcrZdt1115mRI0eGjX/nnXfMJZdcYlwul7n00ktNQUFB2Paamhrz6KOPmoSEBON2u82gQYNMWVnZmTgUazXmNfjPf/5j0tPTTadOnUzr1q1Nly5dzOjRo6354RkJ9Tn/tf//89PluuuuO+19NndRxpzFn88EAABW4h4YAABgHQIGAABYh4ABAADWIWAAAIB1CBgAAGAdAgYAAFiHgAEAANYhYAAAgHUIGAAAYB0CBgAAWIeAAQAA1iFgAACAdf4fUJLB0G9gIAEAAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "#2 - \n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "def rand_sample_mean(n):\n",
    "    samp = np.random.randn(n)\n",
    "    return np.mean(samp)\n",
    "\n",
    "mean_list = []\n",
    "for i in range(0,10000):\n",
    "    mean_list.append(rand_sample_mean(1000))\n",
    "    \n",
    "\n",
    "plt.hist(mean_list)\n",
    "\n",
    "(np.array(mean_list) > 0.166).sum()\n",
    "\n",
    "#None, but if we reduce the sample size we get more"
   ]
  },
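  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One way to see why the count above is zero, assuming the 1000 draws are independent with standard deviation $\\sigma = 1$: the sample mean has a standard error of $\\sigma / \\sqrt{n}$, so 0.166 sits more than five standard errors from the expected mean. The $n = 100$ comparison is an illustrative assumption, not part of the exercise.\n",
    "\n",
    "$$\\mathrm{SE} = \\frac{\\sigma}{\\sqrt{n}} = \\frac{1}{\\sqrt{1000}} \\approx 0.0316, \\qquad \\frac{0.166}{0.0316} \\approx 5.25$$\n",
    "\n",
    "$$n = 100: \\quad \\mathrm{SE} = \\frac{1}{\\sqrt{100}} = 0.1, \\qquad \\frac{0.166}{0.1} = 1.66$$\n",
    "\n",
    "A value 1.66 standard errors above the mean is exceeded in roughly 5% of samples, which is consistent with the comment that smaller samples produce more exceedances."
   ]
  },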
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*For solutions, see `solutions/hypothesis_one.py`*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Types of hypotheses\n",
    "\n",
    "Any test of a hypothesis starts with a formal declaration of the hypothesis and its assumptions. Some general assumptions are things like \"tests are independent of each other\", \"sample drawn at random from the population\" and other factors that might reduce any latent causes that aren't being tested.\n",
    "\n",
    "The Null hypothesis is a \"business as normal\" hypothesis. The new medicine doesn't work. The new strategy didn't make a difference to sales. The sample doesn't differ from the population.\n",
    "\n",
    "The Alternative hypothesis is that our intervention caused some change. For instance, the new medicine reduces illness. Sales increased significantly from the new strategy. The sample is different from the population.\n",
    "\n",
    "Normally we are interested in computing some statistic and then identifying what the likelihood is of that statistic having occurred by chance, based on our assumptions.\n",
    "\n",
    "A commonly used method here is a p-test, where we are testing if a mean is different (>, <, $\\neq$) to the population mean, when we assume the mean from a sample is otherwise drawn from a normal distribution. For instance, if we roll 100 dice, we get an expected value of 350, and a normal distribution of results centred around this value:\n"
   ]
  },
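  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The expected value quoted above, and the spread of the sum, can be worked out as follows. This is a sketch assuming fair six-sided dice rolled independently; the symbols $X_i$ (one roll) and $S$ (the sum of all 100 rolls) are introduced here for illustration.\n",
    "\n",
    "$$E[X_i] = \\frac{1+2+3+4+5+6}{6} = 3.5, \\qquad \\mathrm{Var}(X_i) = \\frac{6^2 - 1}{12} = \\frac{35}{12}$$\n",
    "\n",
    "$$E[S] = 100 \\times 3.5 = 350, \\qquad \\mathrm{SD}(S) = \\sqrt{100 \\times \\frac{35}{12}} \\approx 17.1$$\n",
    "\n",
    "By the central limit theorem, $S$ is approximately normal, so about 95% of sums should fall within roughly two standard deviations of 350, i.e. between about 316 and 384."
   ]
  },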
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[1;31mDocstring:\u001b[0m\n",
      "randint(low, high=None, size=None, dtype=int)\n",
      "\n",
      "Return random integers from `low` (inclusive) to `high` (exclusive).\n",
      "\n",
      "Return random integers from the \"discrete uniform\" distribution of\n",
      "the specified dtype in the \"half-open\" interval [`low`, `high`). If\n",
      "`high` is None (the default), then results are from [0, `low`).\n",
      "\n",
      ".. note::\n",
      "    New code should use the ``integers`` method of a ``default_rng()``\n",
      "    instance instead; please see the :ref:`random-quick-start`.\n",
      "\n",
      "Parameters\n",
      "----------\n",
      "low : int or array-like of ints\n",
      "    Lowest (signed) integers to be drawn from the distribution (unless\n",
      "    ``high=None``, in which case this parameter is one above the\n",
      "    *highest* such integer).\n",
      "high : int or array-like of ints, optional\n",
      "    If provided, one above the largest (signed) integer to be drawn\n",
      "    from the distribution (see above for behavior if ``high=None``).\n",
      "    If array-like, must contain integer values\n",
      "size : int or tuple of ints, optional\n",
      "    Output shape.  If the given shape is, e.g., ``(m, n, k)``, then\n",
      "    ``m * n * k`` samples are drawn.  Default is None, in which case a\n",
      "    single value is returned.\n",
      "dtype : dtype, optional\n",
      "    Desired dtype of the result. Byteorder must be native.\n",
      "    The default value is int.\n",
      "\n",
      "    .. versionadded:: 1.11.0\n",
      "\n",
      "Returns\n",
      "-------\n",
      "out : int or ndarray of ints\n",
      "    `size`-shaped array of random integers from the appropriate\n",
      "    distribution, or a single such random int if `size` not provided.\n",
      "\n",
      "See Also\n",
      "--------\n",
      "random_integers : similar to `randint`, only for the closed\n",
      "    interval [`low`, `high`], and 1 is the lowest value if `high` is\n",
      "    omitted.\n",
      "random.Generator.integers: which should be used for new code.\n",
      "\n",
      "Examples\n",
      "--------\n",
      ">>> np.random.randint(2, size=10)\n",
      "array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0]) # random\n",
      ">>> np.random.randint(1, size=10)\n",
      "array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])\n",
      "\n",
      "Generate a 2 x 4 array of ints between 0 and 4, inclusive:\n",
      "\n",
      ">>> np.random.randint(5, size=(2, 4))\n",
      "array([[4, 0, 2, 1], # random\n",
      "       [3, 2, 2, 0]])\n",
      "\n",
      "Generate a 1 x 3 array with 3 different upper bounds\n",
      "\n",
      ">>> np.random.randint(1, [3, 5, 10])\n",
      "array([2, 2, 9]) # random\n",
      "\n",
      "Generate a 1 by 3 array with 3 different lower bounds\n",
      "\n",
      ">>> np.random.randint([1, 5, 7], 10)\n",
      "array([9, 8, 7]) # random\n",
      "\n",
      "Generate a 2 by 4 array using broadcasting with dtype of uint8\n",
      "\n",
      ">>> np.random.randint([1, 3, 5, 7], [[10], [20]], dtype=np.uint8)\n",
      "array([[ 8,  6,  9,  7], # random\n",
      "       [ 1, 16,  9, 12]], dtype=uint8)\n",
      "\u001b[1;31mType:\u001b[0m      builtin_function_or_method"
     ]
    }
   ],
   "source": [
    "np.random.randint?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "324\n"
     ]
    }
   ],
   "source": [
    "dice_rolls = np.random.randint(1, 7, size=100)\n",
    "print(dice_rolls.sum())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "sums = np.array([np.random.randint(1, 7, size=100).sum() for i in range(10000)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjwAAAGiCAYAAADjixw0AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy88F64QAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAnzElEQVR4nO3dfXRU9Z3H8c+EkPA4EwJmQtYEclaOkoqgQOPgY5ssAbKu1HQtmrWs5RjXJihi1WQLiA9tKHXV0qWwul3AXVzQ3YZFrNSYtKASQwhmeZDHFUysTmI3ZkZAQiC//YOTu04IPsAMSX7zfp1zzyH397v3/u73hDuf3LkPLmOMEQAAgMViunsAAAAAkUbgAQAA1iPwAAAA6xF4AACA9Qg8AADAegQeAABgPQIPAACwHoEHAABYj8ADAACsR+ABAADW+9qBZ/PmzbrpppuUkpIil8uldevWOW1tbW16+OGHNWbMGA0cOFApKSn6/ve/rw8//DBkHc3NzcrPz5fb7VZCQoJmzZqlI0eOhPTZsWOHrrvuOvXr10+pqalavHjxue0hAACIel878Bw9elRjx47V0qVLz2g7duyYtm/frvnz52v79u36zW9+o3379umv/uqvQvrl5+dr9+7dKi8v14YNG7R582YVFBQ47cFgUJMnT9aIESNUW1urn//851q4cKGeffbZc9hFAAAQ7Vzn8/JQl8ulsrIyTZ8+/ax9ampq9M1vflPvv/++0tLStGfPHmVkZKimpkYTJkyQJG3cuFHTpk3TBx98oJSUFC1btkw//vGP5ff7FRcXJ0kqLi7WunXrtHfv3nMdLgAAiFKxkd5AIBCQy+VSQkKCJKmqqkoJCQlO2JGk7OxsxcTEqLq6Wt/5zndUVVWl66+/3gk7kpSTk6Of/exn+uSTTzRkyJAzttPa2qrW1lbn5/b2djU3N2vo0KFyuVyR20EAABA2xhh9+umnSklJUUxM+C41jmjgOX78uB5++GHddtttcrvdkiS/36+kpKTQQcTGKjExUX6/3+mTnp4e0sfr9TptXQWe0tJSPfroo5HYDQAAcIE1NDTo4osvDtv6IhZ42tradOutt8oYo2XLlkVqM46SkhLNnTvX+TkQCCgtLU0NDQ1O2AIAAD1bMBhUamqqBg8eHNb1RiTwdISd999/X5WVlSGBIzk5WU1NTSH9T548qebmZiUnJzt9GhsbQ/p0/NzRp7P4+HjFx8efMd/tdhN4AADoZcJ9OUrYn8PTEXYOHDig119/XUOHDg1p9/l8amlpUW1trTOvsrJS7e3tyszMdPps3rxZbW1tTp/y8nJdeumlXX6dBQAA8EW+duA5cuSI6urqVFdXJ0k6dOiQ6urqVF9fr7a2Nn33u9/Vtm3btHr1ap06dUp+v19+v18nTpyQJI0ePVpTpkzRXXfdpa1bt+qtt95SUVGRZsyYoZSUFEnS7bffrri4OM2aNUu7d+/W2rVr9Ytf/CLkKysAAICv6mvflv6HP/xB3/rWt86YP3PmTC1cuPCMi407/P73v9eNN94o6fSDB4uKivTyyy8rJiZGeXl5WrJkiQYNGuT037FjhwoLC1VTU6Nhw4Zp9uzZevjhh7/yOIPBoDwejwKBAF9pAQDQS0Tq8/u8nsPTkxF4AADofSL1+c27tAAAgPUIPAAAwHoEHgAAYD0CDwAAsB6BBwAAWI/AAwAArEfgAQAA1iPwAAAA6xF4AACA9Qg8AADAerHdPQAAOBcji18552UPL8oN40gA9Aac4QEAANYj8AAAAOsReAAAgPUIPAAAwHoEHgAAYD0CDwAAsB6BBwAAWI/AAwAArEfgAQAA1iPwAAAA6xF4AACA9Qg8AADAegQeAABgPQIPAACwHoEHAABYj8ADAACsR+ABAADWI/AAAADrEXgAAID1CDwAAMB6BB4AAGA9Ag8AALAegQcAAFiPwAMAAKxH4AEAANYj8AAAAOsReAAAgPUIPAAA
wHoEHgAAYD0CDwAAsB6BBwAAWI/AAwAArEfgAQAA1iPwAAAA6xF4AACA9Qg8AADAegQeAABgPQIPAACwHoEHAABYj8ADAACs97UDz+bNm3XTTTcpJSVFLpdL69atC2k3xmjBggUaPny4+vfvr+zsbB04cCCkT3Nzs/Lz8+V2u5WQkKBZs2bpyJEjIX127Nih6667Tv369VNqaqoWL1789fcOAABA5xB4jh49qrFjx2rp0qVdti9evFhLlizR8uXLVV1drYEDByonJ0fHjx93+uTn52v37t0qLy/Xhg0btHnzZhUUFDjtwWBQkydP1ogRI1RbW6uf//znWrhwoZ599tlz2EUAABDtXMYYc84Lu1wqKyvT9OnTJZ0+u5OSkqIHHnhAP/rRjyRJgUBAXq9XK1eu1IwZM7Rnzx5lZGSopqZGEyZMkCRt3LhR06ZN0wcffKCUlBQtW7ZMP/7xj+X3+xUXFydJKi4u1rp167R3796vNLZgMCiPx6NAICC3232uuwighxpZ/Mo5L3t4UW4YRwIgnCL1+R3Wa3gOHTokv9+v7OxsZ57H41FmZqaqqqokSVVVVUpISHDCjiRlZ2crJiZG1dXVTp/rr7/eCTuSlJOTo3379umTTz7pctutra0KBoMhEwAAgBTmwOP3+yVJXq83ZL7X63Xa/H6/kpKSQtpjY2OVmJgY0qerdXx+G52VlpbK4/E4U2pq6vnvEAAAsII1d2mVlJQoEAg4U0NDQ3cPCQAA9BBhDTzJycmSpMbGxpD5jY2NTltycrKamppC2k+ePKnm5uaQPl2t4/Pb6Cw+Pl5utztkAgAAkMIceNLT05WcnKyKigpnXjAYVHV1tXw+nyTJ5/OppaVFtbW1Tp/Kykq1t7crMzPT6bN582a1tbU5fcrLy3XppZdqyJAh4RwyAACIAl878Bw5ckR1dXWqq6uTdPpC5bq6OtXX18vlcmnOnDl64okntH79eu3cuVPf//73lZKS4tzJNXr0aE2ZMkV33XWXtm7dqrfeektFRUWaMWOGUlJSJEm333674uLiNGvWLO3evVtr167VL37xC82dOzdsOw4AAKJH7NddYNu2bfrWt77l/NwRQmbOnKmVK1fqoYce0tGjR1VQUKCWlhZde+212rhxo/r16+css3r1ahUVFSkrK0sxMTHKy8vTkiVLnHaPx6PXXntNhYWFGj9+vIYNG6YFCxaEPKsHAADgqzqv5/D0ZDyHB7Abz+EB7NQrnsMDAADQExF4AACA9Qg8AADAegQeAABgPQIPAACwHoEHAABYj8ADAACsR+ABAADWI/AAAADrEXgAAID1CDwAAMB6BB4AAGA9Ag8AALAegQcAAFiPwAMAAKxH4AEAANYj8AAAAOsReAAAgPUIPAAAwHoEHgAAYD0CDwAAsB6BBwAAWI/AAwAArEfgAQAA1ovt7gEAiF4ji1/p7iEAiBKc4QEAANYj8AAAAOsReAAAgPW4hgdA1Dmfa4cOL8oN40gAXCic4QEAANYj8AAAAOsReAAAgPUIPAAAwHoEHgAAYD0CDwAAsB6BBwAAWI/AAwAArEfgAQAA1uNJywDOC288B9AbcIYHAABYj8ADAACsR+ABAADWI/AAAADrEXgAAID1CDwAAMB6BB4AAGA9Ag8AALAegQcAAFiPwAMAAKxH4AEAANYLe+A5deqU5s+fr/T0dPXv319//ud/rscff1zGGKePMUYLFizQ8OHD1b9/f2VnZ+vAgQMh62lublZ+fr7cbrcSEhI0a9YsHTlyJNzDBQAAUSDsLw/92c9+pmXLlmnVqlX6xje+oW3btunOO++Ux+PRvffeK0lavHixlixZolWrVik9PV3z589XTk6O3n33XfXr10+SlJ+fr48++kjl5eVqa2vTnXfeqYKCAr3wwgvhHjJghfN5iefhRblhHAkA9DxhDzxbtmzRzTffrNzc0wfQkSNH6t///d+1detWSafP7jzzzDOaN2+ebr75ZknS888/L6/X
q3Xr1mnGjBnas2ePNm7cqJqaGk2YMEGS9Mtf/lLTpk3Tk08+qZSUlHAPGwAijlAKdJ+wf6U1adIkVVRUaP/+/ZKk//7v/9abb76pqVOnSpIOHTokv9+v7OxsZxmPx6PMzExVVVVJkqqqqpSQkOCEHUnKzs5WTEyMqquru9xua2urgsFgyAQAACBF4AxPcXGxgsGgLrvsMvXp00enTp3ST37yE+Xn50uS/H6/JMnr9YYs5/V6nTa/36+kpKTQgcbGKjEx0enTWWlpqR599NFw7w4AALBA2M/wvPjii1q9erVeeOEFbd++XatWrdKTTz6pVatWhXtTIUpKShQIBJypoaEhotsDAAC9R9jP8Dz44IMqLi7WjBkzJEljxozR+++/r9LSUs2cOVPJycmSpMbGRg0fPtxZrrGxUePGjZMkJScnq6mpKWS9J0+eVHNzs7N8Z/Hx8YqPjw/37gAAAAuE/QzPsWPHFBMTuto+ffqovb1dkpSenq7k5GRVVFQ47cFgUNXV1fL5fJIkn8+nlpYW1dbWOn0qKyvV3t6uzMzMcA8ZAABYLuxneG666Sb95Cc/UVpamr7xjW/onXfe0VNPPaUf/OAHkiSXy6U5c+boiSee0KhRo5zb0lNSUjR9+nRJ0ujRozVlyhTdddddWr58udra2lRUVKQZM2ZwhxYAAPjawh54fvnLX2r+/Pn64Q9/qKamJqWkpOjuu+/WggULnD4PPfSQjh49qoKCArW0tOjaa6/Vxo0bnWfwSNLq1atVVFSkrKwsxcTEKC8vT0uWLAn3cAEAQBRwmc8/AtkiwWBQHo9HgUBAbre7u4cDRFx3PePlfLbbG3VXrXgOD6JFpD6/eZcWAACwHoEHAABYj8ADAACsR+ABAADWI/AAAADrhf22dAC9T7TdaXU+qBXQO3GGBwAAWI/AAwAArEfgAQAA1iPwAAAA6xF4AACA9Qg8AADAegQeAABgPQIPAACwHoEHAABYj8ADAACsR+ABAADWI/AAAADrEXgAAID1CDwAAMB6BB4AAGA9Ag8AALAegQcAAFiPwAMAAKxH4AEAANYj8AAAAOsReAAAgPUIPAAAwHoEHgAAYD0CDwAAsB6BBwAAWI/AAwAArEfgAQAA1iPwAAAA6xF4AACA9Qg8AADAegQeAABgPQIPAACwHoEHAABYj8ADAACsR+ABAADWI/AAAADrEXgAAID1CDwAAMB6BB4AAGA9Ag8AALAegQcAAFiPwAMAAKxH4AEAANYj8AAAAOsReAAAgPUiEnj++Mc/6m/+5m80dOhQ9e/fX2PGjNG2bducdmOMFixYoOHDh6t///7Kzs7WgQMHQtbR3Nys/Px8ud1uJSQkaNasWTpy5EgkhgsAACwX9sDzySef6JprrlHfvn316quv6t1339U//MM/aMiQIU6fxYsXa8mSJVq+fLmqq6s1cOBA5eTk6Pjx406f/Px87d69W+Xl5dqwYYM2b96sgoKCcA8XAABEAZcxxoRzhcXFxXrrrbf0xhtvdNlujFFKSooeeOAB/ehHP5IkBQIBeb1erVy5UjNmzNCePXuUkZGhmpoaTZgwQZK0ceNGTZs2TR988IFSUlK+dBzBYFAej0eBQEButzt8Owj0UCOLX+nuISCCDi/K7e4hABdEpD6/w36GZ/369ZowYYL++q//WklJSbryyiv13HPPOe2HDh2S3+9Xdna2M8/j8SgzM1NVVVWSpKqqKiUkJDhhR5Kys7MVExOj6urqLrfb2tqqYDAYMgEAAEgRCDzvvfeeli1bplGjRul3v/ud7rnnHt17771atWqVJMnv90uSvF5vyHJer9dp8/v9SkpKCmmPjY1VYmKi06ez0tJSeTweZ0pNTQ33rgEAgF4q7IGnvb1dV111lX7605/qyiuvVEFBge666y4tX7483JsKUVJSokAg4EwNDQ0R3R4AAOg9wh54hg8froyMjJB5o0ePVn19vSQpOTlZktTY2BjSp7Gx0WlLTk5WU1NT
SPvJkyfV3Nzs9OksPj5ebrc7ZAIAAJAiEHiuueYa7du3L2Te/v37NWLECElSenq6kpOTVVFR4bQHg0FVV1fL5/NJknw+n1paWlRbW+v0qaysVHt7uzIzM8M9ZAAAYLnYcK/w/vvv16RJk/TTn/5Ut956q7Zu3apnn31Wzz77rCTJ5XJpzpw5euKJJzRq1Cilp6dr/vz5SklJ0fTp0yWdPiM0ZcoU56uwtrY2FRUVacaMGV/pDi0AAIDPC3vgmThxosrKylRSUqLHHntM6enpeuaZZ5Sfn+/0eeihh3T06FEVFBSopaVF1157rTZu3Kh+/fo5fVavXq2ioiJlZWUpJiZGeXl5WrJkSbiHCwAAokDYn8PTU/AcHkQbnsNjN57Dg2jRa57DAwAA0NMQeAAAgPUIPAAAwHoEHgAAYD0CDwAAsB6BBwAAWI/AAwAArEfgAQAA1iPwAAAA6xF4AACA9Qg8AADAegQeAABgPQIPAACwHoEHAABYj8ADAACsR+ABAADWI/AAAADrEXgAAID1CDwAAMB6BB4AAGA9Ag8AALAegQcAAFgvtrsHAOD/jSx+pbuHAABW4gwPAACwHoEHAABYj8ADAACsR+ABAADWI/AAAADrEXgAAID1CDwAAMB6BB4AAGA9HjwIAL3A+TyU8vCi3DCOBOidOMMDAACsR+ABAADWI/AAAADrEXgAAID1CDwAAMB6BB4AAGA9Ag8AALAegQcAAFiPwAMAAKxH4AEAANYj8AAAAOsReAAAgPUIPAAAwHoEHgAAYD0CDwAAsB6BBwAAWI/AAwAArEfgAQAA1ot44Fm0aJFcLpfmzJnjzDt+/LgKCws1dOhQDRo0SHl5eWpsbAxZrr6+Xrm5uRowYICSkpL04IMP6uTJk5EeLgAAsFBEA09NTY3+6Z/+SVdccUXI/Pvvv18vv/yyXnrpJW3atEkffvihbrnlFqf91KlTys3N1YkTJ7RlyxatWrVKK1eu1IIFCyI5XAAAYKmIBZ4jR44oPz9fzz33nIYMGeLMDwQC+vWvf62nnnpK3/72tzV+/HitWLFCW7Zs0dtvvy1Jeu211/Tuu+/q3/7t3zRu3DhNnTpVjz/+uJYuXaoTJ05EasgAAMBSEQs8hYWFys3NVXZ2dsj82tpatbW1hcy/7LLLlJaWpqqqKklSVVWVxowZI6/X6/TJyclRMBjU7t27u9xea2urgsFgyAQAACBJsZFY6Zo1a7R9+3bV1NSc0eb3+xUXF6eEhISQ+V6vV36/3+nz+bDT0d7R1pXS0lI9+uijYRg9AACwTdjP8DQ0NOi+++7T6tWr1a9fv3Cv/qxKSkoUCAScqaGh4YJtGwAA9GxhDzy1tbVqamrSVVddpdjYWMXGxmrTpk1asmSJYmNj5fV6deLECbW0tIQs19jYqOTkZElScnLyGXdtdfzc0aez+Ph4ud3ukAkAAECKQODJysrSzp07VVdX50wTJkxQfn6+8+++ffuqoqLCWWbfvn2qr6+Xz+eTJPl8Pu3cuVNNTU1On/LycrndbmVkZIR7yAAAwHJhv4Zn8ODBuvzyy0PmDRw4UEOHDnXmz5o1S3PnzlViYqLcbrdmz54tn8+nq6++WpI0efJkZWRk6I477tDixYvl9/s1b948FRYWKj4+PtxDBgAAlovIRctf5umnn1ZMTIzy8vLU2tqqnJwc/epXv3La+/Tpow0bNuiee+6Rz+fTwIEDNXPmTD322GPdMVwAANDLuYwxprsHEQnBYFAej0eBQIDredBrjCx+pbuHAAsdXpTb3UMAvrJIfX7zLi0AAGA9Ag8AALAegQcAAFiPwAMAAKxH4AEAANYj8AAAAOsReAAAgPUIPAAAwHoEHgAAYD0CDwAAsF63vEsLAHDhnM8rS3gtBWzBGR4AAGA9Ag8AALAeX2kBYcYbzwGg5+EMDwAAsB6BBwAAWI/AAwAArEfgAQAA1iPwAAAA6xF4AACA9Qg8AADAegQeAABgPQIPAACwHoEH
AABYj8ADAACsx7u0AABndT7vhju8KDeMIwHOD2d4AACA9Qg8AADAegQeAABgPQIPAACwHoEHAABYj8ADAACsR+ABAADWI/AAAADrEXgAAID1CDwAAMB6BB4AAGA9Ag8AALAegQcAAFiPt6UDXTifN0QDAHoezvAAAADrEXgAAID1CDwAAMB6BB4AAGA9Ag8AALAed2kBACLifO52PLwoN4wjATjDAwAAogCBBwAAWI/AAwAArEfgAQAA1iPwAAAA64U98JSWlmrixIkaPHiwkpKSNH36dO3bty+kz/Hjx1VYWKihQ4dq0KBBysvLU2NjY0if+vp65ebmasCAAUpKStKDDz6okydPhnu4AAAgCoQ98GzatEmFhYV6++23VV5erra2Nk2ePFlHjx51+tx///16+eWX9dJLL2nTpk368MMPdcsttzjtp06dUm5urk6cOKEtW7Zo1apVWrlypRYsWBDu4QIAgCjgMsaYSG7g448/VlJSkjZt2qTrr79egUBAF110kV544QV997vflSTt3btXo0ePVlVVla6++mq9+uqr+su//Et9+OGH8nq9kqTly5fr4Ycf1scff6y4uLgv3W4wGJTH41EgEJDb7Y7kLqKH4o3nQO/Fc3iiV6Q+vyN+DU8gEJAkJSYmSpJqa2vV1tam7Oxsp89ll12mtLQ0VVVVSZKqqqo0ZswYJ+xIUk5OjoLBoHbv3t3ldlpbWxUMBkMmAAAAKcKBp729XXPmzNE111yjyy+/XJLk9/sVFxenhISEkL5er1d+v9/p8/mw09He0daV0tJSeTweZ0pNTQ3z3gAAgN4qooGnsLBQu3bt0po1ayK5GUlSSUmJAoGAMzU0NER8mwAAoHeI2Lu0ioqKtGHDBm3evFkXX3yxMz85OVknTpxQS0tLyFmexsZGJScnO322bt0asr6Ou7g6+nQWHx+v+Pj4MO8FAACwQdjP8BhjVFRUpLKyMlVWVio9PT2kffz48erbt68qKiqcefv27VN9fb18Pp8kyefzaefOnWpqanL6lJeXy+12KyMjI9xDBgAAlgv7GZ7CwkK98MIL+q//+i8NHjzYuebG4/Gof//+8ng8mjVrlubOnavExES53W7Nnj1bPp9PV199tSRp8uTJysjI0B133KHFixfL7/dr3rx5Kiws5CwOAAD42sIeeJYtWyZJuvHGG0Pmr1ixQn/7t38rSXr66acVExOjvLw8tba2KicnR7/61a+cvn369NGGDRt0zz33yOfzaeDAgZo5c6Yee+yxcA8XAABEgYg/h6e78Bwe8BweoPfiOTzRq9c+hwcAAKC7EXgAAID1CDwAAMB6BB4AAGA9Ag8AALAegQcAAFiPwAMAAKxH4AEAANYj8AAAAOsReAAAgPUIPAAAwHoEHgAAYD0CDwAAsB6BBwAAWI/AAwAArEfgAQAA1iPwAAAA6xF4AACA9Qg8AADAegQeAABgPQIPAACwXmx3DwAAgM5GFr9yzsseXpQbxpHAFpzhAQAA1iPwAAAA6xF4AACA9Qg8AADAely0jB7tfC5cBACgA2d4AACA9Qg8AADAegQeAABgPQIPAACwHoEHAABYj8ADAACsR+ABAADWI/AAAADrEXgAAID1CDwAAMB6vFoCEcfrIQAA3Y3AAwCwyvn8kXV4UW4YR4KehK+0AACA9Qg8AADAegQeAABgPQIPAACwHoEHAABYj7u08JVwazmAaMAdXvbiDA8AALAegQcAAFiPwAMAAKxH4AEAANbjouUowUXHABBZXPDcs/XoMzxLly7VyJEj1a9fP2VmZmrr1q3dPSQAANAL9djAs3btWs2dO1ePPPKItm/frrFjxyonJ0dNTU3dPTQAANDLuIwxprsH0ZXMzExNnDhR//iP/yhJam9vV2pqqmbPnq3i4uIvXT4YDMrj8SgQCMjtdkd6uD0eX2kBgJ1s+zosUp/fPfIanhMnTqi2tlYlJSXOvJiYGGVnZ6uqqqrLZVpbW9Xa2ur8HAgE
JJ0uXLhd/sjvznnZXY/mdMt2AQB2isTnXHfq2J9wn4/pkYHnT3/6k06dOiWv1xsy3+v1au/evV0uU1paqkcfffSM+ampqREZ47nyPNPdIwAA2MTWz5VPP/1UHo8nbOvrkYHnXJSUlGju3LnOz+3t7Xr//fc1btw4NTQ08LVWJ8FgUKmpqdSmE+pydtSma9Tl7KhN16hL1zrqUl9fL5fLpZSUlLCuv0cGnmHDhqlPnz5qbGwMmd/Y2Kjk5OQul4mPj1d8fHzIvJiY09dku91ufqnOgtp0jbqcHbXpGnU5O2rTNerSNY/HE5G69Mi7tOLi4jR+/HhVVFQ489rb21VRUSGfz9eNIwMAAL1RjzzDI0lz587VzJkzNWHCBH3zm9/UM888o6NHj+rOO+/s7qEBAIBepscGnu9973v6+OOPtWDBAvn9fo0bN04bN24840LmLxIfH69HHnnkjK+6QG3OhrqcHbXpGnU5O2rTNerStUjXpcc+hwcAACBceuQ1PAAAAOFE4AEAANYj8AAAAOsReAAAgPV6XeApLS3VxIkTNXjwYCUlJWn69Onat29fSB+/36877rhDycnJGjhwoK666ir953/+Z0if5uZm5efny+12KyEhQbNmzdKRI0cu5K6E3bJly3TFFVc4D7Py+Xx69dVXnfbjx4+rsLBQQ4cO1aBBg5SXl3fGwx3r6+uVm5urAQMGKCkpSQ8++KBOnjx5oXclrL6oLs3NzZo9e7YuvfRS9e/fX2lpabr33nudd7F1sLEu0pf/znQwxmjq1KlyuVxat25dSJuNtfkqdamqqtK3v/1tDRw4UG63W9dff70+++wzpz0ajzHReuztyqJFi+RyuTRnzhxnXrQegz+vc10u6DHY9DI5OTlmxYoVZteuXaaurs5MmzbNpKWlmSNHjjh9/uIv/sJMnDjRVFdXm//5n/8xjz/+uImJiTHbt293+kyZMsWMHTvWvP322+aNN94wl1xyibntttu6Y5fCZv369eaVV14x+/fvN/v27TN///d/b/r27Wt27dpljDHm7/7u70xqaqqpqKgw27ZtM1dffbWZNGmSs/zJkyfN5ZdfbrKzs80777xjfvvb35phw4aZkpKS7tqlsPiiuuzcudPccsstZv369ebgwYOmoqLCjBo1yuTl5TnL21oXY778d6bDU089ZaZOnWokmbKyMme+rbX5srps2bLFuN1uU1paanbt2mX27t1r1q5da44fP+6sIxqPMdF67O1s69atZuTIkeaKK64w9913nzM/Wo/BHbqqy4U8Bve6wNNZU1OTkWQ2bdrkzBs4cKB5/vnnQ/olJiaa5557zhhjzLvvvmskmZqaGqf91VdfNS6Xy/zxj3+8MAO/QIYMGWL++Z//2bS0tJi+ffual156yWnbs2ePkWSqqqqMMcb89re/NTExMcbv9zt9li1bZtxut2ltbb3gY4+kjrp05cUXXzRxcXGmra3NGBNddTHmzNq888475s/+7M/MRx99dEbgiabafL4umZmZZt68eWftG43HGGM49hpjzKeffmpGjRplysvLzQ033OB8sEf7MfhsdelKpI7Bve4rrc46TnslJiY68yZNmqS1a9equblZ7e3tWrNmjY4fP64bb7xR0ulT0QkJCZowYYKzTHZ2tmJiYlRdXX1Bxx8pp06d0po1a3T06FH5fD7V1taqra1N2dnZTp/LLrtMaWlpqqqqknS6LmPGjAl5uGNOTo6CwaB27959wfchEjrXpSuBQEBut1uxsaefyxkNdZG6rs2xY8d0++23a+nSpV2+xy4aatO5Lk1NTaqurlZSUpImTZokr9erG264QW+++aazTDQeYySOvZJUWFio3NzckGOtpKg/Bp+tLl2J1DG4xz5p+atob2/XnDlzdM011+jyyy935r/44ov63ve+p6FDhyo2NlYDBgxQWVmZLrnkEkmnv2dOSkoKWVdsbKwSExPl9/sv6D6E286dO+Xz+XT8+HENGjRIZWVl
ysjIUF1dneLi4pSQkBDS3+v1Ovvs9/vPeJJ1x8+21qWzP/3pT3r88cdVUFDgzLO5LtIX1+b+++/XpEmTdPPNN3e5rM21OVtd3n77bUnSwoUL9eSTT2rcuHF6/vnnlZWVpV27dmnUqFFReYyRovvYK0lr1qzR9u3bVVNTc0ab3++P2mPwF9Wls0geg3t14CksLNSuXbtC/rKSpPnz56ulpUWvv/66hg0bpnXr1unWW2/VG2+8oTFjxnTTaC+MSy+9VHV1dQoEAvqP//gPzZw5U5s2beruYXW7s9Xl86EnGAwqNzdXGRkZWrhwYfcN9gI7W20OHjyoyspKvfPOO909xG5xtrq0t7dLku6++27n3X5XXnmlKioq9C//8i8qLS3tzmFH3Bf9X4rmY29DQ4Puu+8+lZeXq1+/ft09nB7j69Ql4sfgc/5CrpsVFhaaiy++2Lz33nsh8w8ePGgknXHRZVZWlrn77ruNMcb8+te/NgkJCSHtbW1tpk+fPuY3v/lNZAd+gWVlZZmCggJTUVFhJJlPPvkkpD0tLc089dRTxhhj5s+fb8aOHRvS/t577xlJIRcd2qCjLh2CwaDx+XwmKyvLfPbZZyF9o6kuxvx/be677z7jcrlMnz59nEmSiYmJMTfccIMxJrpq01GXjv3713/915D2W2+91dx+++3GmOg8xkT7sbesrMxIOuP/S8f/oddffz0qj8FfVpeTJ08aYy7MMbjXXcNjjFFRUZHKyspUWVmp9PT0kPZjx45JkmJiQnetT58+zl9mPp9PLS0tqq2tddorKyvV3t6uzMzMCO/BhdXe3q7W1laNHz9effv2VUVFhdO2b98+1dfXO9+/+3w+7dy5U01NTU6f8vJyud3uLr/+6c066iKd/qti8uTJiouL0/r168/4KySa6iL9f22Ki4u1Y8cO1dXVOZMkPf3001qxYoWk6KpNR11GjhyplJSUMx6HsX//fo0YMUJSdB5jov3Ym5WVpZ07d4b8f5kwYYLy8/Odf0fjMfjL6tKnT58Ldww+3/R2od1zzz3G4/GYP/zhD+ajjz5ypmPHjhljjDlx4oS55JJLzHXXXWeqq6vNwYMHzZNPPmlcLpd55ZVXnPVMmTLFXHnllaa6utq8+eabZtSoUb3+1sji4mKzadMmc+jQIbNjxw5TXFxsXC6Xee2114wxp2+JTEtLM5WVlWbbtm3G5/MZn8/nLN9x69/kyZNNXV2d2bhxo7nooot6/S2RX1SXQCBgMjMzzZgxY8zBgwdDfqc6/vKwtS7GfPnvTGc6y23pttXmy+ry9NNPG7fbbV566SVz4MABM2/ePNOvXz9z8OBBZx3RdoyJ5mPv2XS+Gylaj8Gdfb4uF/IY3OsCj6QupxUrVjh99u/fb2655RaTlJRkBgwYYK644oozbpX83//9X3PbbbeZQYMGGbfbbe68807z6aefXuC9Ca8f/OAHZsSIESYuLs5cdNFFJisrK+SD67PPPjM//OEPzZAhQ8yAAQPMd77zHfPRRx+FrOPw4cNm6tSppn///mbYsGHmgQcecG4N7K2+qC6///3vz/o7dejQIWcdNtbFmC//nemsc+Axxs7afJW6lJaWmosvvtgMGDDA+Hw+88Ybb4S0R+MxJlqPvWfTOfBE6zG4s8/X5UIeg13GGHMup6kAAAB6i153DQ8AAMDXReABAADWI/AAAADrEXgAAID1CDwAAMB6BB4AAGA9Ag8AALAegQcAAFiPwAMAAKxH4AEAANYj8AAAAOsReAAAgPX+D4gmJU2FBN+BAAAAAElFTkSuQmCC",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.hist(sums, bins=30);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we suspected a dice to be \"loaded\", that is, it is more likely to come up with a given number, we might run an experiment. We might suspect this dice to roll higher numbers more frequently than lower numbers. Our hypothesis is:\n",
    "\n",
    "$H_0$: The dice has a true expected value of 3.5\n",
    "\n",
    "$H_A$: The dice has an expected value of significantly more than 3.5\n",
    "\n",
    "\n",
    "We roll this suspect dice 100 times, and compute the mean. We get a total value of 410. We then do this experiment 3 more times, getting values of 420, 400, 405. This is unlikely, as it is an expected value of around 4.1. This would fit with our hypothesis, but how unlikely is it? We will look into this further in a later module, but for now, we can look up this value:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy import stats"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "t stat: 13.7602, p value: 0.000831\n"
     ]
    }
   ],
   "source": [
    "t_stat, p_val = stats.ttest_1samp([410, 420, 400, 405], 350)\n",
    "print('t stat: {:.4f}, p value: {:4f}'.format(t_stat, p_val))"
   ]
  },
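  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`stats.ttest_1samp` reports a two-sided p-value by default, whereas our $H_A$ is one-sided (an expected value of *more* than 3.5). As a rough sketch, assuming the same four totals as above, a one-sided p-value can be read from the upper tail of the t distribution. The variable names are illustrative only.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from scipy import stats\n",
    "\n",
    "totals = np.array([410, 420, 400, 405])  # totals of 100 rolls each\n",
    "expected_total = 100 * 3.5               # 350 under the null hypothesis\n",
    "\n",
    "# Two-sided test, as run above\n",
    "t_stat, p_two_sided = stats.ttest_1samp(totals, expected_total)\n",
    "\n",
    "# One-sided p-value: the area in the upper tail of a t distribution\n",
    "# with n - 1 degrees of freedom\n",
    "dof = len(totals) - 1\n",
    "p_one_sided = stats.t.sf(t_stat, dof)\n",
    "\n",
    "print('t stat: {:.4f}'.format(t_stat))\n",
    "print('two-sided p: {:.6f}, one-sided p: {:.6f}'.format(p_two_sided, p_one_sided))\n",
    "```\n",
    "\n",
    "Because the t distribution is symmetric and the t-stat here is positive, the one-sided p-value is half of the two-sided one."
   ]
  },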
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The t-stat, while important, isn't really of value here. It is an intermediate statistic we then use to compute the p-value, which is what is normally needed here. The p-value is simply a probability, which you can get in any valid method you can think of. In this case, it is a statistical test \"what is the probability of getting a t stat of this size by chance?\".\n",
    "\n",
    "#### Exercise\n",
    "\n",
    "1. If we assume a confidence level of 0.05 qualifies as \"significant\", what can we say about our hypothesis?\n",
    "2. As a percentage, what is the likelihood of the given samples being obtained by chance?\n",
    "3. Can we suggest that the dice rolls values of 6 more frequently than other values?\n",
    "4. Review the documentation for all of the `scipy.stats.ttest_???` functions and identify when each would be needed.\n",
    "\n",
    "\n",
    "Note there are also T tests in the `statsmodels` package, which can be called in a similar way.\n",
    "\n",
    "\n",
    "#### Extended Exercise\n",
    "\n",
    "Create an alternative hypothesis and experiment to address question 3 above. How can we test if a dice rolls 6s more frequently?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. Reject the null hypothesis, the dice is biased or more specifically the mean is significantly different to 350.\n",
    "2. 0.08%\n",
    "3. Not necessarily rolls more 6's\n",
    "4. One sample t test used above"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*For solutions, see `solutions/inferring_statistics.py`*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Common problems and issues\n",
    "\n",
    "### Large P values\n",
    "\n",
    "If you do not get a lower p-value, you do not have a strong result. We cannot accept the null hypothesis, we just \"fail to reject\" it.\n",
    "\n",
    "If we were to run our above statistic and get a p value of 0.2 (assuming a threshold of significance of 0.05), we do not \"accept the null hypothesis\", we simply fail to reject it at our significance level.\n",
    "\n",
    "\n",
    "### Thresholds\n",
    "\n",
    "It is important also to ensure the p value threshold is set *before* the experiment, and not after you get your results. If you do not, it is tempting to just \"change\" the threshold after you get your results. This removes any independence you had in your result, and the experimenter is effectively arbitrarily changing the result - why bother doing the test in this case?\n",
    "\n",
    "A common value for the threshold is 0.05. There is no basis in this value, it is just \"what is generally used\". If your hypothesis is a matter of life or death, this value is probably too high. If it is of no great consequence, it might be too low. Reason about the value before blindly accepting on.\n",
    "\n",
    "Also note that people are generally terrible dealing with low probabilities. The difference between a false positive of 0.02 and one of 0.01 is \"one in fifty\" compared to \"one in one hundred\". That's twice as likely for the first. Typically, it can be better to estimate with an intuitive \"one in ...\" amount, then convert that to a percentage.\n",
    "\n",
    "\n",
    "### Multiple Simultaneous experiments\n",
    "\n",
    "With modern computers able to run simulations continuously, and at scale, a common issue arising is that the p-value thresholds most commonly used (i.e. 0.05) are only valid for individual tests.\n",
    "\n",
    "For example, suppose we had the following hypothesis:\n",
    "\n",
    "$H_0$: Stock price changes are random for IBM on Mondays\n",
    "\n",
    "$H_A$: Stock prices are more likely to drop on Mondays.\n",
    "\n",
    "\n",
    "We might test this hypothesis and get a p value of 0.2, indicating there is insufficient evidence to reject the null hypothesis, and therefore do not automatically buy on Tuesdays after the drop.\n",
    "\n",
    "We might then consider \"does this pattern hold on any other day?\". So we consider the same hypothesis, but for Tuesday, Wednesday, Thursday and Friday. We find the following significance levels:\n",
    "\n",
    "* Monday: 0.2\n",
    "* Tuesday: 0.1\n",
    "* Wednesday: 0.8\n",
    "* Thursday: 0.04\n",
    "* Friday: 0.5\n",
    "\n",
    "Aha! Thursdays have a *highly significant* effect. We fail to reject all other null hypothesis, accept the Thursday one, and start trading. \n",
    "\n",
    "What went wrong?\n",
    "\n",
    "\n",
    "#### Exercise\n",
    "\n",
    "1. Research the \"Multiple comparisons problem\" and identify a way to fix our hypothesis. We still want to check if any day has a significant value for our hypothesis, but we want to do it in a rigorous way.\n",
    "2. Does our finding hold after adjusting? The solution uses one specific method of fixing the thresholds - if you choose another, then you may get another answer."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. The more tests we do, the more likely we'll reject a hypothesis by chance. Hence, adjust the P-value for increased number of tests, for example the Bonferroni adjustment.\n",
    "2. No it would not hold."
   ]
  },
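  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Bonferroni adjustment can be sketched in a few lines. This is only an illustration using the five p-values listed above; the variable names are made up for this example, and the family-wise figure assumes the five tests are independent.\n",
    "\n",
    "```python\n",
    "p_values = {'Monday': 0.2, 'Tuesday': 0.1, 'Wednesday': 0.8,\n",
    "            'Thursday': 0.04, 'Friday': 0.5}\n",
    "alpha = 0.05\n",
    "\n",
    "# Chance of at least one false positive across all tests if each uses alpha\n",
    "# (assumes the tests are independent)\n",
    "family_wise_rate = 1 - (1 - alpha) ** len(p_values)\n",
    "\n",
    "# Bonferroni: divide the threshold by the number of tests\n",
    "adjusted_alpha = alpha / len(p_values)\n",
    "significant_days = [day for day, p in p_values.items() if p < adjusted_alpha]\n",
    "\n",
    "print('Chance of a false positive somewhere: {:.3f}'.format(family_wise_rate))\n",
    "print('Adjusted threshold: {:.3f}'.format(adjusted_alpha))\n",
    "print('Significant days:', significant_days)\n",
    "```\n",
    "\n",
    "Bonferroni is a conservative choice; other corrections may flag more results as significant."
   ]
  },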
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*For solutions, see `solutions/multiple_comparisons.py`*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Simulations\n",
    "\n",
    "A topic we will get into in more detail later, but a useful one to touch on here, is the use of simulations for computing p values. When testing a hypothesis, you can use a simulation of your null hypothesis, and then with that simulation, estimate the likelihood of findings like your sample. For instance:\n",
    "\n",
    "$H_0$: The AUD/USD change is a random walk (that is, there is no pattern)\n",
    "\n",
    "$H_A$: The AUD/USD change tends to follow the previous change with lag 1 (that is, if the previous change was up, it is more likely this change will be up too)\n",
    "\n",
    "\n",
    "A little more formally, if we ignore \"no change\", we might say that $p$, the proportion of changes that are consistent with the previous change, is 50%, when we have:\n",
    "\n",
    "$H_0: p = 0.5$\n",
    "\n",
    "$H_A: p > 0.5$\n",
    "\n",
    "\n",
    "To do this, we might analyse some data and find that we get a proportion with a value of 0.6, that is, the proportion of times that a change follows the previous change is 0.6 (60%). Is this \"significant\"?\n",
    "\n",
    "\n",
    "<div class=\"alert alert-warning\">That value above is artificial - we compute the real data in the exercise later</div>\n",
    "\n",
    "To test this, we can create a simulation. In this simulation, we are focused on our null hypothesis - that there is no relationship between a change and the previous one. We might run our experiment for one year's worth of data (i.e. 365 changes), with each change randomly and uniformly chosen from \"up\" or \"down\". We then measure the proportion values to get our result from the simulation. Repeat many times, and you can then use this to estimate the p-value, or the probability that the null hypothesis is true.\n",
    "\n",
    "\n",
    "#### Exercises\n",
    "\n",
    "1. Download the USD/AUD prices from Quandl\n",
    "2. Identify whether each change is \"up\" or \"down\" and compute the sample proportion value (it was 0.6 in the artificial data above)\n",
    "\n",
    "#### Extended Exercise\n",
    "\n",
    "1. Create and run the simulation mentioned above, where we simulate a random walk scenario and compute the proportion of times a change corresponds with the previous change.\n",
    "2. Run the simulation many times\n",
    "3. Compute the p value and determine whether to accept or reject the null hypothesis."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "t stat: 0.1080, p value: 0.914005\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAi8AAAGhCAYAAACphlRxAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy88F64QAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAgyElEQVR4nO3df2xV9f3H8VcLtPyQe5sC7aWj/BCVHwqIILX+VhooID8GywZjCIbARooLNirUIA72jWVqxGgQ3CagmYgzmRgxYrAISCigNR0I0kBXBAe3IE1bykah9PP9Y+HMa8vwtvdyedPnIzkJPefc08/5eDg8vb23N8455wQAAGBEfKwHAAAAEA7iBQAAmEK8AAAAU4gXAABgCvECAABMIV4AAIApxAsAADCFeAEAAKYQLwAAwBTiBQAAmBJWvOTn5+v2229Xx44dlZKSogkTJqikpCRkn/vvv19xcXEhy29+85uQfY4cOaIxY8aoffv2SklJ0RNPPKG6urrmnw0AALjmtQ5n561btyonJ0e333676urq9NRTT2nEiBHav3+/OnTo4O03a9YsLVmyxPu6ffv23p8vXLigMWPGKBAIaMeOHTp+/LgefvhhtWnTRs8++2wETgkAAFzL4przwYwnT55USkqKtm7dqnvvvVfSf555ufXWW/XSSy81+piPPvpIDz30kI4dO6bU1FRJ0sqVKzV//nydPHlSCQkJl/2+9fX1OnbsmDp27Ki4uLimDh8AAFxBzjmdPn1aaWlpio9vxitXXDMcPHjQSXJ79+711t13332uc+fOrlOnTu7mm292CxYscGfOnPG2P/30027QoEEhx/nHP/7hJLkvv/yy0e9z9uxZV1VV5S379+93klhYWFhYWFgMLkePHm1Ofriwfmz0ffX19Zo3b57uuusu3XLLLd76X/7yl+rRo4fS0tK0Z88ezZ8/XyUlJfrb3/4mSQoGg94zLhdd/DoYDDb6vfLz87V48eIG648ePSqfz9fUUwAAAFdQdXW10tPT1bFjx2Ydp8nxkpOTo6+++krbt28PWT979mzvzwMGDFDXrl01fPhwlZaWqnfv3k36Xnl5ecrNzfW+vnjyPp+PeAEAwJjmvuSjST9wmjt3rjZs2KBPP/1U3bp1+5/7ZmRkSJIOHTokSQoEAiovLw/Z5+LXgUCg0WMkJiZ6oUKwAADQsoUVL845zZ07V++99542b96sXr16XfYxxcXFkqSuXbtKkjIzM7V3716dOHHC22fTpk3y+Xzq379/OMMBAAAtUFg/NsrJydHatWv1/vvvq2PHjt5rVPx+v9q1a6fS0lKtXbtWo0ePVqdOnbRnzx499thjuvfeezVw4EBJ0ogRI9S/f39NmzZNzz33nILBoBYuXKicnBwlJiZG/gwBAMA1Jay3Sl/qZ1SrV6/WjBkzdPToUf3qV7/SV199pTNnzig9PV0//elPtXDhwpAf9XzzzTeaM2eOtmzZog4dOmj69OlaunSpWrf+cS1VXV0tv9+vqqoqfoQEAIARkfr3u1m/5yVWiBcAAOyJ1L/ffLYRAAAwhXgBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArxAgAATCFeAACAKcQLAAAwJazPNgKAK6Hngg9jPYSwHV46JtZDAFoMnnkBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArvNgKACOAdUsCVwzMvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwh
XgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMCWseMnPz9ftt9+ujh07KiUlRRMmTFBJSUnIPmfPnlVOTo46deqk6667TpMmTVJ5eXnIPkeOHNGYMWPUvn17paSk6IknnlBdXV3zzwYAAFzzwoqXrVu3KicnRzt37tSmTZt0/vx5jRgxQmfOnPH2eeyxx/TBBx/o3Xff1datW3Xs2DFNnDjR237hwgWNGTNG586d044dO/TGG29ozZo1WrRoUeTOCgAAXLPinHOuqQ8+efKkUlJStHXrVt17772qqqpSly5dtHbtWv3sZz+TJB04cED9+vVTYWGh7rjjDn300Ud66KGHdOzYMaWmpkqSVq5cqfnz5+vkyZNKSEi47Petrq6W3+9XVVWVfD5fU4cP4CrVc8GHsR5Ci3B46ZhYDwEtTKT+/W7Wa16qqqokScnJyZKkoqIinT9/XllZWd4+ffv2Vffu3VVYWChJKiws1IABA7xwkaSRI0equrpa+/bta/T71NbWqrq6OmQBAAAtU5Pjpb6+XvPmzdNdd92lW265RZIUDAaVkJCgpKSkkH1TU1MVDAa9fb4fLhe3X9zWmPz8fPn9fm9JT09v6rABAIBxTY6XnJwcffXVV1q3bl0kx9OovLw8VVVVecvRo0ej/j0BAMDVqXVTHjR37lxt2LBB27ZtU7du3bz1gUBA586dU2VlZcizL+Xl5QoEAt4+u3fvDjnexXcjXdznhxITE5WYmNiUoQIAgGtMWM+8OOc0d+5cvffee9q8ebN69eoVsn3IkCFq06aNCgoKvHUlJSU6cuSIMjMzJUmZmZnau3evTpw44e2zadMm+Xw+9e/fvznnAgAAWoCwnnnJycnR2rVr9f7776tjx47ea1T8fr/atWsnv9+vmTNnKjc3V8nJyfL5fHr00UeVmZmpO+64Q5I0YsQI9e/fX9OmTdNzzz2nYDCohQsXKicnh2dXAADAZYUVLytWrJAk3X///SHrV69erRkzZkiSli1bpvj4eE2aNEm1tbUaOXKkXn31VW/fVq1aacOGDZozZ44yMzPVoUMHTZ8+XUuWLGnemQAAgBahWb/nJVb4PS/AtY3f83Jl8HtecKVdFb/nBQAA4EojXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACY0jrWAwAQXT0XfBjrIQBARPHMCwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwJO162bdumsWPHKi0tTXFxcVq/fn3I9hkzZiguLi5kyc7ODtmnoqJCU6dOlc/nU1JSkmbOnKmamppmnQgAAGgZwo6XM2fOaNCgQVq+fPkl98nOztbx48e95e233w7ZPnXqVO3bt0+bNm3Shg0btG3bNs2ePTv80QMAgBandbgPGDVqlEaNGvU/90lMTFQgEGh029dff62NGzfq888/19ChQyVJr7zyikaPHq0XXnhBaWlp4Q4JAAC0IFF5zcuWLVuUkpKiPn36aM6cOTp16pS3rbCwUElJSV64SFJWVpbi4+O1a9euRo9XW1ur6urqkAUAALRMEY+X7OxsvfnmmyooKNAf/vAHbd26VaNGjdKFCxckScFg
UCkpKSGPad26tZKTkxUMBhs9Zn5+vvx+v7ekp6dHetgAAMCIsH9sdDmTJ0/2/jxgwAANHDhQvXv31pYtWzR8+PAmHTMvL0+5ubne19XV1QQMAAAtVNTfKn399derc+fOOnTokCQpEAjoxIkTIfvU1dWpoqLikq+TSUxMlM/nC1kAAEDLFPV4+fbbb3Xq1Cl17dpVkpSZmanKykoVFRV5+2zevFn19fXKyMiI9nAAAIBxYf/YqKamxnsWRZLKyspUXFys5ORkJScna/HixZo0aZICgYBKS0v15JNP6oYbbtDIkSMlSf369VN2drZmzZqllStX6vz585o7d64mT57MO40AAMBlhf3MyxdffKHBgwdr8ODBkqTc3FwNHjxYixYtUqtWrbRnzx6NGzdON910k2bOnKkhQ4bos88+U2JioneMt956S3379tXw4cM1evRo3X333frjH/8YubMCAADXrLCfebn//vvlnLvk9o8//viyx0hOTtbatWvD/dYAAAB8thEAALCFeAEAAKZE/Pe8AABs6Lngw1gPIWyHl46J9RBwFeCZFwAAYArxAgAATCFeAACAKcQLAAAwhXgBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArxAgAATCFeAACAKcQLAAAwhXgBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArxAgAATCFeAACAKcQLAAAwhXgBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArxAgAATCFeAACAKcQLAAAwhXgBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArxAgAATCFeAACAKcQLAAAwhXgBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArxAgAATCFeAACAKcQLAAAwhXgBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArxAgAATCFeAACAKcQLAAAwhXgBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArxAgAATCFeAACAKcQLAAAwJex42bZtm8aOHau0tDTFxcVp/fr1Idudc1q0aJG6du2qdu3aKSsrSwcPHgzZp6KiQlOnTpXP51NSUpJmzpypmpqaZp0IAABoGcKOlzNnzmjQoEFavnx5o9ufe+45vfzyy1q5cqV27dqlDh06aOTIkTp79qy3z9SpU7Vv3z5t2rRJGzZs0LZt2zR79uymnwUAAGgxWof7gFGjRmnUqFGNbnPO6aWXXtLChQs1fvx4SdKbb76p1NRUrV+/XpMnT9bXX3+tjRs36vPPP9fQoUMlSa+88opGjx6tF154QWlpac04HQAAcK2L6GteysrKFAwGlZWV5a3z+/3KyMhQYWGhJKmwsFBJSUleuEhSVlaW4uPjtWvXrkaPW1tbq+rq6pAFAAC0TBGNl2AwKElKTU0NWZ+amuptCwaDSklJCdneunVrJScne/v8UH5+vvx+v7ekp6dHctgAAMAQE+82ysvLU1VVlbccPXo01kMCAAAxEtF4CQQCkqTy8vKQ9eXl5d62QCCgEydOhGyvq6tTRUWFt88PJSYmyufzhSwAAKBlimi89OrVS4FAQAUFBd666upq7dq1S5mZmZKkzMxMVVZWqqioyNtn8+bNqq+vV0ZGRiSHAwAArkFhv9uopqZGhw4d8r4uKytTcXGxkpOT1b17d82bN0//93//pxtvvFG9evXS008/rbS0NE2YMEGS1K9fP2VnZ2vWrFlauXKlzp8/r7lz52ry5Mm80wgAAFxW2PHyxRdf6IEHHvC+zs3NlSRNnz5da9as0ZNPPqkzZ85o9uzZqqys1N13362NGzeqbdu23mPeeustzZ07V8OHD1d8fLwmTZqkl19+OQKnAwAArnVxzjkX60GEq7q6Wn6/X1VVVbz+BbiMngs+jPUQgIg5vHRMrIeAZojUv98m3m0EAABwEfECAABMIV4AAIApxAsAADCFeAEAAKYQLwAAwBTiBQAAmEK8AAAAU4gXAABgStgfDwC0ZPy2
WgCIPZ55AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGBKxOPld7/7neLi4kKWvn37etvPnj2rnJwcderUSdddd50mTZqk8vLySA8DAABco6LyzMvNN9+s48ePe8v27du9bY899pg++OADvfvuu9q6dauOHTumiRMnRmMYAADgGtQ6Kgdt3VqBQKDB+qqqKr3++utau3atHnzwQUnS6tWr1a9fP+3cuVN33HFHNIYDAACuIVF55uXgwYNKS0vT9ddfr6lTp+rIkSOSpKKiIp0/f15ZWVnevn379lX37t1VWFh4yePV1taquro6ZAEAAC1TxOMlIyNDa9as0caNG7VixQqVlZXpnnvu0enTpxUMBpWQkKCkpKSQx6SmpioYDF7ymPn5+fL7/d6Snp4e6WEDAAAjIv5jo1GjRnl/HjhwoDIyMtSjRw/99a9/Vbt27Zp0zLy8POXm5npfV1dXEzAAALRQUX+rdFJSkm666SYdOnRIgUBA586dU2VlZcg+5eXljb5G5qLExET5fL6QBQAAtExRj5eamhqVlpaqa9euGjJkiNq0aaOCggJve0lJiY4cOaLMzMxoDwUAAFwDIv5jo8cff1xjx45Vjx49dOzYMT3zzDNq1aqVpkyZIr/fr5kzZyo3N1fJycny+Xx69NFHlZmZyTuNAACX1XPBh7EeQtgOLx0T6yFccyIeL99++62mTJmiU6dOqUuXLrr77ru1c+dOdenSRZK0bNkyxcfHa9KkSaqtrdXIkSP16quvRnoYAADgGhXnnHOxHkS4qqur5ff7VVVVxetfcEVZ/L8+ALHFMy//Fal/v/lsIwAAYArxAgAATCFeAACAKcQLAAAwhXgBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArxAgAATCFeAACAKcQLAAAwhXgBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArxAgAATCFeAACAKcQLAAAwhXgBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArxAgAATCFeAACAKcQLAAAwpXWsB4CWq+eCD2M9BACAQTzzAgAATCFeAACAKcQLAAAwhde8AAAQRVZf33d46ZhYD+GSeOYFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMIV4AQAAphAvAADAFOIFAACYQrwAAABTiBcAAGAK8QIAAEwhXgAAgCnECwAAMKV1rAeAyLD6kesAAISLZ14AAIApxAsAADCFeAEAAKYQLwAAwBTiBQAAmEK8AAAAU4gXAABgCvECAABMIV4AAIApxAsAADCFeAEAAKYQLwAAwBQ+mLERfMghAABXL555AQAApsQ0XpYvX66ePXuqbdu2ysjI0O7du2M5HAAAYEDM4uWdd95Rbm6unnnmGX355ZcaNGiQRo4cqRMnTsRqSAAAwICYxcuLL76oWbNm6ZFHHlH//v21cuVKtW/fXqtWrYrVkAAAgAExecHuuXPnVFRUpLy8PG9dfHy8srKyVFhY2GD/2tpa1dbWel9XVVVJkqqrq6Myvvraf0XluAAAWBGNf2MvHtM516zjxCRevvvuO124cEGpqakh61NTU3XgwIEG++fn52vx4sUN1qenp0dtjAAAtGT+l6J37NOnT8vv9zf58Sbe
Kp2Xl6fc3Fzv6/r6elVUVKhTp06Ki4uT9J+aS09P19GjR+Xz+WI11KsG8xGK+WiIOQnFfDTEnIRiPkI1ZT6cczp9+rTS0tKa9b1jEi+dO3dWq1atVF5eHrK+vLxcgUCgwf6JiYlKTEwMWZeUlNTosX0+HxfV9zAfoZiPhpiTUMxHQ8xJKOYjVLjz0ZxnXC6KyQt2ExISNGTIEBUUFHjr6uvrVVBQoMzMzFgMCQAAGBGzHxvl5uZq+vTpGjp0qIYNG6aXXnpJZ86c0SOPPBKrIQEAAANiFi+/+MUvdPLkSS1atEjBYFC33nqrNm7c2OBFvD9WYmKinnnmmQY/XmqpmI9QzEdDzEko5qMh5iQU8xEqlvMR55r7fiUAAIAriM82AgAAphAvAADAFOIFAACYQrwAAABTrpp4Wb58uXr27Km2bdsqIyNDu3fv/lGPW7duneLi4jRhwoSQ9c45LVq0SF27dlW7du2UlZWlgwcPhuxTUVGhqVOnyufzKSkpSTNnzlRNTU2kTqlZIjkf58+f1/z58zVgwAB16NBBaWlpevjhh3Xs2LGQx/bs2VNxcXEhy9KlSyN5Ws0S6WtkxowZDc43Ozs7ZJ+Wco1IajAXF5fnn3/e2+dqvkbCmY81a9Y0OI+2bduG7GP9HiJFdk6uhftIpK8R6/cQKfJzcsXuI+4qsG7dOpeQkOBWrVrl9u3b52bNmuWSkpJceXn5/3xcWVmZ+8lPfuLuueceN378+JBtS5cudX6/361fv979/e9/d+PGjXO9evVy//73v719srOz3aBBg9zOnTvdZ5995m644QY3ZcqUaJxiWCI9H5WVlS4rK8u988477sCBA66wsNANGzbMDRkyJOTxPXr0cEuWLHHHjx/3lpqammicYtiicY1Mnz7dZWdnh5xvRUVFyD4t5RpxzoXMw/Hjx92qVatcXFycKy0t9fa5Wq+RcOdj9erVzufzhZxHMBgM2cfyPcS5yM+J9ftINK4Ry/cQ56IzJ1fqPnJVxMuwYcNcTk6O9/WFCxdcWlqay8/Pv+Rj6urq3J133un+/Oc/u+nTp4fciOvr610gEHDPP/+8t66ystIlJia6t99+2znn3P79+50k9/nnn3v7fPTRRy4uLs7985//jODZhS/S89GY3bt3O0num2++8db16NHDLVu2rLnDj4pozMnl5qmlXyPjx493Dz74YMi6q/UaCXc+Vq9e7fx+/yWPZ/0e4lzk56Qxlu4j0ZgPy/cQ567MNRKt+0jMf2x07tw5FRUVKSsry1sXHx+vrKwsFRYWXvJxS5YsUUpKimbOnNlgW1lZmYLBYMgx/X6/MjIyvGMWFhYqKSlJQ4cO9fbJyspSfHy8du3aFYlTa5JozEdjqqqqFBcX1+AzopYuXapOnTpp8ODBev7551VXV9ek84ikaM7Jli1blJKSoj59+mjOnDk6deqUt60lXyPl5eX68MMPG933artGmjofNTU16tGjh9LT0zV+/Hjt27fP22b5HiJFZ04aY+U+Es35sHgPka7MNRLN+0jMP1X6u+++04ULFxr8Zt3U1FQdOHCg0cds375dr7/+uoqLixvdHgwGvWP88JgXtwWDQaWkpIRsb926tZKTk719YiEa8/FDZ8+e1fz58zVlypSQD9P67W9/q9tuu03JycnasWOH8vLydPz4cb344otNPp9IiNacZGdna+LEierVq5dKS0v11FNPadSoUSosLFSrVq1a9DXyxhtvqGPHjpo4cWLI+qvxGmnKfPTp00erVq3SwIEDVVVVpRdeeEF33nmn9u3bp27dupm+h0jRmZMfsnQfidZ8WL2HSFfmGonmfSTm8RKu06dPa9q0afrTn/6kzp07x3o4MRfufJw/f14///nP5ZzTihUrQrbl5uZ6fx44cKASEhL061//Wvn5+aZ+HfaPnZPJkyd7fx4wYIAGDhyo3r17a8uWLRo+fPiVGOoV0ZS/M6tWrdLUqVMbvBjvWrlGMjMzQz4E9s47
71S/fv302muv6fe//30MRxY74cxJS7iP/Jj5aCn3kIvC/XsTzftIzOOlc+fOatWqlcrLy0PWl5eXKxAINNi/tLRUhw8f1tixY7119fX1kv5TtCUlJd7jysvL1bVr15Bj3nrrrZKkQCCgEydOhBy7rq5OFRUVjX7fKyUa89G7d29J/73hfPPNN9q8efNlP8I8IyNDdXV1Onz4sPr06dPcU2uyaM7J911//fXq3LmzDh06pOHDh7fIa0SSPvvsM5WUlOidd9657Fiuhmsk3PloTJs2bTR48GAdOnRIkkzfQ6TozMlFFu8j0ZyP77NyD5GiPyfRvo/E/DUvCQkJGjJkiAoKCrx19fX1KigoCCm8i/r27au9e/equLjYW8aNG6cHHnhAxcXFSk9PV69evRQIBEKOWV1drV27dnnHzMzMVGVlpYqKirx9Nm/erPr6emVkZETxjP+3aMyH9N8bzsGDB/XJJ5+oU6dOlx1LcXGx4uPjGzzteaVFa05+6Ntvv9WpU6e8f6xa2jVy0euvv64hQ4Zo0KBBlx3L1XCNhDsfjblw4YL27t3r/be3fA+RojMnkt37SLTm44es3EOk6M9J1O8jzXq5b4SsW7fOJSYmujVr1rj9+/e72bNnu6SkJO8tWNOmTXMLFiy45OMbe8X30qVLXVJSknv//ffdnj173Pjx4xt9m+PgwYPdrl273Pbt292NN954VbyFLdLzce7cOTdu3DjXrVs3V1xcHPL2tNraWuecczt27HDLli1zxcXFrrS01P3lL39xXbp0cQ8//HBUz/XHivScnD592j3++OOusLDQlZWVuU8++cTddttt7sYbb3Rnz5719msp18hFVVVVrn379m7FihUNtl3N10i487F48WL38ccfu9LSUldUVOQmT57s2rZt6/bt2+ftY/ke4lzk58T6fSTS82H9HuJcdP7eOHdl7iNXRbw459wrr7ziunfv7hISEtywYcPczp07vW333Xefmz59+iUf29iNuL6+3j399NMuNTXVJSYmuuHDh7uSkpKQfU6dOuWmTJnirrvuOufz+dwjjzziTp8+HcnTarJIzkdZWZmT1Ojy6aefOuecKyoqchkZGc7v97u2bdu6fv36uWeffTbkL2GsRXJO/vWvf7kRI0a4Ll26uDZt2rgePXq4WbNmNfidBS3lGrnotddec+3atXOVlZUNtl3t10g48zFv3jxv39TUVDd69Gj35ZdfhhzP+j3EucjOybVwH4nkfFwL9xDnIv/3xrkrcx+Jc865H/88DQAAQGzF/DUvAAAA4SBeAACAKcQLAAAwhXgBAACmEC8AAMAU4gUAAJhCvAAAAFOIFwAAYArxAgAATCFeAACAKcQLAAAwhXgBAACm/D+WBHMjpxOuNAAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "#from solutions\n",
    "#1 \n",
    "import quandl\n",
    "xrate = quandl.get(\"RBA/FXRUSD-AUD-USD-Exchange-Rate\")\n",
    "\n",
    "#2\n",
    "xdelta = xrate.diff().dropna()\n",
    "xdelta['lag'] = xdelta['Value'].shift(1)\n",
    "xdelta = xdelta.dropna()\n",
    "xdelta['matched_movement'] = xdelta['Value']*xdelta['lag'] > 0 # compare values to lagged values\n",
    "samp_prop = np.sum(xdelta['matched_movement'])/xdelta.shape[0] # get proportion of movements that match the previous movement\n",
    "samp_prop\n",
    "\n",
    "\n",
    "#Extended Exercise\n",
    "\n",
    "iters = range(1000) #Repeat our experiment 1000 times.\n",
    "results = []\n",
    "distribution = stats.norm(0, 1)\n",
    "\n",
    "for i in iters:\n",
    "    X = distribution.rvs(365) # create randomly distributed data for a year\n",
    "    data = pd.DataFrame([X[i] for i in range(len(X))])\n",
    "    data.columns = ['Value']\n",
    "    data['lag'] = data['Value'].shift(1)\n",
    "    data = data.dropna()\n",
    "    data['matched_movement'] = data['Value']*data['lag'] > 0 # compare values to lagged values\n",
    "    data_prop = np.sum(data['matched_movement'])/data.shape[0]\n",
    "    results.append(data_prop)\n",
    "\n",
    "plt.hist(results)\n",
    "\n",
    "t_stat, p_val = stats.ttest_1samp(results, 0.5)\n",
    "print('t stat: {:.4f}, p value: {:4f}'.format(t_stat, p_val))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.48459915611814347"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "samp_prop"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "t stat: 18.4573, p value: 0.000000\n"
     ]
    }
   ],
   "source": [
    "t_stat, p_val = stats.ttest_1samp(results, samp_prop)\n",
    "print('t stat: {:.4f}, p value: {:4f}'.format(t_stat, p_val))"
   ]
  },
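  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A p-value can also be read directly off the simulated results, as the fraction of simulated proportions that are at least as large as the observed one. The following is a minimal sketch only: it assumes the artificial observed proportion of 0.6 from earlier rather than the real data, and the seed and variable names are arbitrary choices for illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "rng = np.random.default_rng(42)  # seed chosen arbitrarily for repeatability\n",
    "n_changes = 365\n",
    "n_simulations = 1000\n",
    "observed_prop = 0.6  # the artificial value from the text, not real data\n",
    "\n",
    "simulated_props = np.empty(n_simulations)\n",
    "for k in range(n_simulations):\n",
    "    # Under the null hypothesis each change is up (+1) or down (-1) with equal chance\n",
    "    changes = rng.choice([-1, 1], size=n_changes)\n",
    "    # A change 'matches' when it has the same sign as the previous change\n",
    "    matches = changes[1:] == changes[:-1]\n",
    "    simulated_props[k] = matches.mean()\n",
    "\n",
    "# One-sided empirical p-value: how often does chance alone do at least this well?\n",
    "p_value = np.mean(simulated_props >= observed_prop)\n",
    "print('mean simulated proportion: {:.4f}'.format(simulated_props.mean()))\n",
    "print('empirical p value: {:.4f}'.format(p_value))\n",
    "```\n",
    "\n",
    "With only 1000 simulations the smallest non-zero p-value this can report is 0.001, so more repetitions are needed to resolve very small probabilities."
   ]
  },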
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*For solutions, see `solutions/hypothesis_two.py`*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Tests for specific attributes\n",
    "\n",
    "Often it is of use to know that data has a certain property, for instance it is normally distributed, correlated, or whether two samples are basically the same or not.\n",
    "\n",
    "\n",
    "### Correlation between variables\n",
    "\n",
    "These tests are designed to test whether two samples of data are independent of each other, or if a dependency exists:\n",
    "\n",
    "$H_0$ the two samples tested are independent from each other\n",
    "\n",
    "$H_A$ the two samples tested have a dependency between them\n",
    "\n",
    "* Spearman's Rank Correlation, implemented as `scipy.stats.spearmanr`\n",
    "* Pearson's correlation coefficient, implemented as `scipy.stats.pearsonr`\n",
    "* Chi-Squared test, implemented as `scipy.stats.chi2_contingency` and `statsmodels.stats.proportion.proportions_chisquare`\n",
    "\n",
    "\n",
    "### Gaussian Distribution Tests\n",
    "\n",
    "There are a few tests designed to test that a distribution is Gaussian (normal). They include:\n",
    "\n",
    "* The Shapiro-Wilk test, implemented as `scipy.stats.shapiro`\n",
    "* D’Agostino’s $K^2$, implemented as `scipy.stats.normaltest`\n",
    "* Kolmogorov-Smirnov, implemented as `scipy.stats.kstest` and `statsmodels.stats.diagnostic.kstest_normal`\n",
    "* Anderson-Darling, implemented as `scipy.stats.anderson`\n",
    "\n",
    "Each of the above tests against the hypothesis:\n",
    "\n",
    "$H_0$ the data has a Gaussian distribution\n",
    "\n",
    "$H_A$ the data does not have a Gaussian distribution\n",
    "\n",
    "\n",
    "### Are two samples equal?\n",
    "\n",
    "These tests assert that, given two samples, they are effectively equal (i.e. they came from the same distribution):\n",
    "\n",
    "* Student's t-test, as identified earlier, implemented in quite a few methods in both scipy and statsmodels\n",
    "* Analysis of Variance Test (ANOVA), `scipy.stats.f_oneway` and `statsmodels.api.stats.anova_lm` (among a few other ways to call it).\n",
    "* Mann-Whitney U Test, implemented as `scipy.stats.manwhitneyu`\n",
    "* Wilcoxon Signed Rank Test, implemented as `scipy.stats.wilcoxon`\n",
    "\n",
    "\n",
    "Note that while the tests in these categories have the same purpose, they are not the same in terms of quality, speed, and even coding signatures! Always check the documentation for the function you are using first, before using it in practice.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Extended Exercise\n",
    "\n",
    "Using a simulation, create your own function that can compute the t-test and p values for a Student's t-test.\n",
    "\n",
    "For a comparison of two independent samples (i.e. \"here are two samples, do they come from the same distribution?\"), the t value is computed as:\n",
    "\n",
    "$ t(X_1, X_2) = \\frac{\\bar{X_1} - \\bar{X_2}}{s}$\n",
    "\n",
    "\n",
    "Where:\n",
    "\n",
    "* $\\bar{X_1}$ is the mean of sample $X_1$\n",
    "* s is the standard error of the difference, which is:\n",
    "\n",
    "$e_1 = \\frac{\\sigma_1}{\\sqrt(n_1)}$\n",
    "\n",
    "Where $\\sigma_1$ is the standard deviation of $X_1$ and $n_1$ is the number of observations in $X_1$, This is the \"standard error\" of $X_1$.\n",
    "\n",
    "Then,\n",
    "\n",
    "$s = \\sqrt{e_1^2 + e_2^2}$\n",
    "\n",
    "Which is the standard error of the difference between the means.\n",
    "\n",
    "The output of your code should be a pandas DataFrame where the index values are the p-values we are testing (i.e. 0.01, 0.05, 0.1, 0.2) and the columns are the degrees-of-freedom, which is how many data points in both $X_1$ and $X_2$, subtracting 2. Values to compute are 5, 10, 20, 50, 100 (and so on if you are inclined).\n",
    "\n",
    "The values can be computed via simulation - that is, draw many random samples, and compute the likelihood of getting a t value at least that high between them."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*For solutions, see `solutions/simulation_ttest.py`*"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
